Rewind

Is Rewind Down Right Now? Check whether there is an ongoing outage.

Rewind is currently Operational

Last checked from Rewind's official status page

Historical record of incidents for Rewind

Report: "Rewind Alerts Experiencing Degraded Performance"

Last update
resolved

This incident has been resolved.

investigating

We're investigating potential degraded performance with the Rewind Alerts app.

Report: "The Rewind Staging web application is unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating possible issues with the Rewind Staging web application.

Report: "QuickBooks Online Backups Not Running"

Last update
resolved

This incident has been resolved.

monitoring

A workaround has been implemented and we are monitoring the situation. We are rerunning the nightly backups that failed last night. We will leave this status update up for the rest of the day. We expect tonight's nightly backups to run as normal. This affects only QuickBooks customers.

investigating

We are exploring a possible mitigation and are in the process of deploying it. This affects only QuickBooks customers.

investigating

We are continuing to investigate. This affects only QuickBooks customers.

investigating

It appears that nightly backups for some of our QuickBooks Online customers did not occur last night. We are currently investigating. This affects only QuickBooks customers.

Report: "Backups for Miro are Degraded"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Degraded Mailchimp Backups"

Last update
resolved

This incident has been resolved.

monitoring

As a temporary fix, we have disabled backups for Mailchimp landing page content objects. All other objects are being backed up successfully.

Report: "Shopify Backups slower than usual"

Last update
resolved

This incident has been resolved.

monitoring

The migration continues. We have taken steps to reduce the slowness, although some backups are still taking longer than usual. We continue to monitor the situation.

monitoring

As we are migrating our Shopify customers to a new Shopify API endpoint, we have noticed that some of our nightly backups (<2%) are slower than usual. This is a temporary situation and we are monitoring very carefully.

Report: "Degraded restore experience in US east region."

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're investigating restores getting stuck in the US East region. Our team is on it!

Report: "Seeing Performance Issues on Shopify Staging"

Last update
resolved

We have resolved the issues affecting performance.

investigating

We are currently investigating this issue.

Report: "Increased error alerts on the rewind administration page"

Last update
resolved

This incident has been resolved.

identified

We have identified the root cause of the issue and are currently deploying a fix.

investigating

We are currently experiencing intermittent slowdowns on app.rewind.com. We are investigating and will update with more details.

Report: "Issue with subscriptions page"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue with our subscriptions page.

Report: "Issues with Advanced Restores for Shopify & BigCommerce"

Last update
resolved

This incident has been resolved.

investigating

We are investigating issues with Advanced Restores on Shopify and BigCommerce.

Report: "Versions of backed up items are not visible in the Rewind web application"

Last update
resolved

The Rewind web application is not displaying versions of backed up items for some platforms. Backups continue to run successfully and are not affected.

Report: "Delayed backups for some customers in US region"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We're investigating delayed backups for some customers in US region. Our team is on it!

Report: "Slow backups in 'eu-central-1' region"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We identified delays in the backups running in the 'eu-central-1' region. We have fixed the issue and are monitoring the results.

Report: "Nightly Rewind backups not being scheduled for all platforms"

Last update
resolved

This incident is resolved and nightly backups now resume as normal.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We're investigating a large number of nightly Rewind backups not being scheduled. Our team is on it!

Report: "Performance slowdown on the app and some backups"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating a performance issue with our core internal API service.

Report: "Legacy Backhub Customers Audit Log Missing History"

Last update
postmortem

# BackHub audit log data loss incident on March 14, 2024

## Summary

On March 14, 2024, during a live Disaster Recovery (DR) test for the BackHub system, the data store for the live audit log was deleted in error. While investigating what had happened, we discovered that no backup of the audit log data store existed, and all audit log records for the BackHub infrastructure were lost.

## What Happened

We perform bi-annual DR testing of all our application services and infrastructure. During this testing, we discovered that for the BackHub infrastructure (which handles backups of GitHub), we had not previously done a full AWS regional failure DR test, and we proceeded to do so. While the DR testing was successful and resulted in good updates to our DR procedures, the teardown of the DR testing region revealed a problem with one of the data stores used by BackHub.

BackHub uses three main data stores for permanent storage:

* For backup metadata and customer information, an AWS RDS database is utilized
* For the actual GitHub backups themselves, AWS EBS storage is utilized
* For the audit logging (backup complete, backup failed, restore complete, etc.), AWS SimpleDB is utilized

All of the infrastructure for the environments is managed using infrastructure-as-code, specifically with Terraform.

### SimpleDB

SimpleDB is an AWS service that launched in 2007 as a ‘simple’ NoSQL data store. It has since been superseded by DynamoDB as the main NoSQL offering from AWS and indeed is no longer accessible in the AWS console. Within Rewind, the BackHub infrastructure is the only component using SimpleDB as a data store. It is this service which is used to hold the audit log records for BackHub.

Within the BackHub infrastructure, the backup services are deployed in multiple regions for data residency purposes. However, the core ‘administration’ database is hosted in a single region with snapshots replicated for DR purposes.

Within Terraform, the configuration for SimpleDB looks like this:

```hcl
# The SimpleDB module is pinned to the eu-west-1 provider alias.
module "simpledb" {
  source = "../modules/simpledb"
  providers = {
    aws = aws.eu-west-1
  }
}
```

This means that whatever provider is used for the backup services, SimpleDB will always be referenced in the eu-west-1 region.

Usually, when performing a _terraform apply_ for a pre-existing resource, the apply will fail and the resource must be imported into the Terraform state. However, that is not the case with SimpleDB: the resource is added to the existing state with no conflicts. For someone not familiar with the fine details of the Terraform template, it appears as if a new SimpleDB instance has been created in the newly configured DR region. The issue then comes when running a _terraform destroy_ operation to remove the DR testing infrastructure: the SimpleDB instance in eu-west-1 is destroyed, rather than the DR copy of the SimpleDB instance that the operator expects.

### SimpleDB Backups

Rewind has extensive policies and procedures around backup and restore of critical infrastructure and data stores, and these are tested bi-annually in a process known as the “restore-a-thon”. During this process, we verify we can restore everything we have backed up, both in the same region and in a replica region within AWS. However, despite all of this, we found we had no backup of the SimpleDB database being used for audit logging, and hence no restoration of the now destroyed database was possible.

We also looked at the possibility of re-creating the audit log database from regular log messages emitted by BackHub, but found that the log messages do not contain enough information to reconstruct the audit log records.

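To illustrate what a manual export of a SimpleDB domain might involve, here is a minimal Python sketch. It is not Rewind's tooling. It assumes a boto3 SimpleDB client (`boto3.client("sdb")`) whose `select` call returns `Items` and an optional `NextToken`; the function names, the output format (one JSON object per line), and any domain name passed in are hypothetical.

```python
import json
from typing import Any, Dict, Iterator


def iter_domain_items(client: Any, domain: str) -> Iterator[Dict[str, Any]]:
    """Yield every item in a SimpleDB domain, following NextToken pagination."""
    # Domain names are quoted with backticks in SimpleDB select expressions.
    kwargs: Dict[str, Any] = {
        "SelectExpression": f"select * from `{domain}`",
        "ConsistentRead": True,
    }
    while True:
        response = client.select(**kwargs)
        for item in response.get("Items", []):
            yield {
                "name": item["Name"],
                "attributes": [
                    {"name": attr["Name"], "value": attr["Value"]}
                    for attr in item["Attributes"]
                ],
            }
        token = response.get("NextToken")
        if not token:
            break  # no further pages
        kwargs["NextToken"] = token


def export_domain(client: Any, domain: str, path: str) -> int:
    """Write one JSON object per line to `path` and return the item count."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in iter_domain_items(client, domain):
            handle.write(json.dumps(record, sort_keys=True) + "\n")
            count += 1
    return count
```

With real credentials, the client would be created along the lines of `boto3.client("sdb", region_name="eu-west-1")`. Copying the resulting file to storage in a second region, and testing a restore from it, would still be separate steps.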
## Lessons Learned and Actions

We are applying the following learnings and actions from this incident:

* SimpleDB has no built-in backup and restore process
  * Action: We will create our own tooling to facilitate this
* While we have guardrails in place to prevent deletion of data stores, SimpleDB was not in this policy
  * Action: We have added SimpleDB to the guardrails around deletion
* SimpleDB backup and restore testing should be performed at the same interval as other data stores
  * Action: SimpleDB is being added to the regular backup and restore testing process
* All data stores should be re-audited for backup and restore capabilities and procedures
  * Action: All data stores are being re-audited to ensure full backup and restore processes and procedures exist
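
One possible form for a deletion guardrail is a check on the Terraform plan before any apply or destroy. The Python sketch below is illustrative only and does not describe Rewind's actual policy. It assumes the plan has been exported with `terraform show -json` and reads the `resource_changes` entries from that output. The set of protected resource types and the function name are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical policy: resource types that no plan is allowed to delete.
PROTECTED_TYPES = frozenset({"aws_simpledb_domain", "aws_db_instance", "aws_ebs_volume"})


def protected_deletions(
    plan: Dict[str, Any],
    protected_types: Iterable[str] = PROTECTED_TYPES,
) -> List[str]:
    """Return the addresses of protected resources that the plan would delete."""
    protected = set(protected_types)
    blocked = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement appears as both "delete" and "create", so it is caught too.
        if "delete" in actions and change.get("type") in protected:
            blocked.append(change["address"])
    return blocked
```

A CI step could load the plan JSON with `json.load`, call `protected_deletions`, and stop the pipeline if the returned list is not empty. Because the check works on resource type rather than on region, it would flag a destroy of the eu-west-1 data store even when the operator believes the plan only touches the DR region.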

resolved

We have identified and remediated the problem with the audit log for the BackHub system. A postmortem will be published for this incident.

identified

We have identified the issue with the BackHub audit log and are considering mitigations.

investigating

We are currently investigating an issue where some customers are missing audit log data in the BackHub product.

Report: "Elevated error rates for Rewind backups/restores/copy for our Shopify, BigCommerce, Trello, QBO platforms"

Last update
resolved

This incident has been resolved.

identified

We have identified an issue affecting several of our platforms. We are working on a fix.

Report: "Backups delayed for some customers"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Backups delayed for some customers"

Last update
resolved

This incident has been resolved.

identified

We've identified an issue affecting backup performance for a subset of customers. Our team is on it!

Report: "Delay in marking backups as complete for Confluence, Jira, Bitbucket in the European region"

Last update
resolved

This incident has been resolved.

identified

We're experiencing delays in marking backups as complete for the Confluence, Jira and Bitbucket platforms in the European region. Our team has identified the problem and is working to resolve it.

Report: "Rewind Webapp search issues for Rewind backups"

Last update
resolved

This incident has been resolved.

investigating

We're investigating Vault search issues in the Rewind app for some customers. Our team is on it!

Report: "Delayed backups for the BigCommerce platform"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We're investigating an issue affecting a subset of our BigCommerce backup customers. Our team is on it!

Report: "Exports experiencing issues"

Last update
resolved

This incident has been resolved.

identified

We have identified an issue with account exports and are currently working on a fix.

Report: "Jira, Confluence, Bitbucket, Github backups delayed in European region"

Last update
resolved

Backups are operating normally. Thank you for your patience.

identified

We've identified an issue affecting backup performance in our European region for Jira, Confluence, Bitbucket, and Github.

Report: "Elevated error rates for Rewind Staging on Shopify"

Last update
resolved

This incident has been resolved.

identified

We've identified the issue and are working on a fix. Our team is on it!

Report: "Elevated error rates for Rewind webhooks for all platforms"

Last update
resolved

This incident has been resolved.

monitoring

We are beginning to see webhooks being processed. We are continuing to monitor the situation.

identified

We are continuing to work on a fix for this issue.

identified

We're investigating elevated error rates for Rewind webhooks for all platforms. Our team is on it! Backups and restores are functional; however, near-real-time backups are currently delayed. Shopify Staging one-time copies are functional; however, continuous copies are currently delayed.

Report: "JIRA and Confluence backups delayed in Australia region"

Last update
resolved

Thank you for your patience, JIRA and Confluence backups have resumed.

investigating

We are currently investigating an issue with JIRA and Confluence backups in the Australia region.

Report: "Sign-in to the Rewind App with Intuit is currently experiencing an issue."

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating the issue. Users are still able to sign in with their email login in the meantime.

Report: "Elevated error rates for the Rewind web application"

Last update
resolved

This incident has been resolved.

identified

We are investigating possible issues with the Rewind web application. Don't fret though, your backups are still running!

Report: "Elevated error rates for the Rewind web application"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are investigating possible issues with the Rewind web application. Don't fret though, your backups are still running!

Report: "Jira Backups experiencing issues"

Last update
resolved

This incident has been resolved.

investigating

We're currently investigating an issue with the Vault view for our Jira backups.

Report: "Elevated error rates for Rewind backups for the Trello platform"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're investigating elevated error rates for Rewind backups for the Trello platform. Our team is on it!

Report: "Shopify Copy experiencing issues"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating the issues with copies for Shopify Copy.

Report: "Shopify Copy experiencing issues"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue for Shopify Copy. BigCommerce Copy is back to operational.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating the issues with BigCommerce Copy and Shopify Copy.

Report: "Elevated error rates for Rewind backups on the Shopify platform"

Last update
resolved

This incident has been resolved.

identified

A fix for the issue has been applied. Degraded performance may persist for a short time while systems stabilize.

identified

We have identified the issue and are working on a fix.

identified

We're investigating elevated error rates for Rewind backups on the Shopify platform.

Report: "Delayed backups on the BigCommerce backup in the US region"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue with Rewind Backups for BigCommerce in the US region.

Report: "Elevated error rates for Rewind backups on the Trello platform"

Last update
resolved

This incident has been resolved.

identified

Our engineers have identified the root cause of the issue and are working on a fix.

identified

We've identified an issue that caused backups for the Trello platform in the US region to not be performed as scheduled on the evening of Sept 8th. Our team has diagnosed the issue and is working on a fix.

Report: "Service interruption on Intuit nightly backups"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and backups are in progress for all accounts that were impacted by this service interruption. We will post an update once this is complete.

investigating

We are experiencing a service interruption to nightly backups for Intuit. Action is underway to resume service, and we will post an update as soon as one is available.

Report: "Delayed backups on the Shopify backup in the US region"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue with Rewind Backups for Shopify in the US region.

Report: "Delayed backups on the Shopify backup in the US region"

Last update
resolved

This incident has been resolved.

identified

We are continuing to bring service back online to full capacity.

investigating

We are currently investigating an issue with Rewind Backups for Shopify in the US region.

Report: "Backups for Github unavailable"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue with backups for GitHub in the AWS EU region.

Report: "Elevated error rates for the Rewind web application"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are investigating possible issues with the Rewind web application. Don't fret though, your backups are still running!

Report: "Elevated error rates for Rewind backups on the BigCommerce platform in the US region"

Last update
resolved

This incident has been resolved.

monitoring

Capacity has been increased in the US region and we're seeing much improved processing rates. Near-real-time backups may be delayed by a few minutes.

investigating

We're investigating elevated error rates for Rewind backups on the BigCommerce platform in the US region. Our team is on it!

Report: "Vault view not loading in recent versions of Chrome from within the BigCommerce Control Panel"

Last update
resolved

This incident has been resolved.

identified

We are continuing to work on a fix for this issue.

identified

Rewind engineers have identified the issue and are working towards remediation. In the interim, please consider using Firefox, Safari, or logging in directly from https://app.rewind.io

Report: "Webapp connectivity"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating an issue with the Rewind Web Application.

Report: "The Rewind web application is experiencing intermittent connectivity errors"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Elevated error rates for Rewind backups on the Shopify platform in the US region"

Last update
resolved

This incident has been resolved and all backup and restore services are functioning normally.

identified

We have identified the issue and engineers are applying a mitigation.

investigating

We are currently investigating this issue.

Report: "Elevated error rates for Rewind backups on the Shopify platform in the US region"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're investigating elevated error rates for Rewind backups on the Shopify platform in the US region. Our team is on it!

Report: "app.rewind.io is not loading"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.