Snapsheet

Is Snapsheet Down Right Now? Check if there is a current outage ongoing.

Snapsheet is currently Operational

Last checked from Snapsheet's official status page

Historical record of incidents for Snapsheet

Report: "OFAC API"

Last update
resolved

This incident has been resolved. If you experience any further issues, please contact Snapsheet support.

identified

OFAC is continuing to work on a fix for this issue.

identified

The OFAC API from trade.gov is experiencing an outage. The provider is seeing high latency on their platform and is currently looking into it. Payments created with the feature enabled will be placed in Compliance Hold until a user approves them.
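The hold behavior described in this update can be read as a fail-closed rule: if screening cannot complete, the payment waits for manual approval. Below is a minimal sketch of that idea only. `Payment`, `screen_payment`, `approve`, `unavailable_check`, and the status strings are hypothetical names chosen for illustration and do not come from Snapsheet's codebase. Holding a payment when screening returns a match is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Payment:
    payee: str
    amount: float
    status: str = "PENDING"


def unavailable_check(payee: str) -> bool:
    """Stand-in for a screening call made while the API is down (hypothetical)."""
    raise TimeoutError("screening service did not respond")


def screen_payment(
    payment: Payment,
    ofac_check: Callable[[str], bool],
    feature_enabled: bool = True,
) -> Payment:
    """Release a payment only when screening completes and finds no match."""
    if not feature_enabled:
        # Screening is switched off for this payment, so nothing blocks it.
        payment.status = "RELEASED"
        return payment
    try:
        is_match = ofac_check(payment.payee)
    except (TimeoutError, ConnectionError):
        # Screening could not complete: fail closed and wait for a user.
        payment.status = "COMPLIANCE_HOLD"
        return payment
    # Assumption: a positive match is also held for review.
    payment.status = "COMPLIANCE_HOLD" if is_match else "RELEASED"
    return payment


def approve(payment: Payment) -> Payment:
    """A user manually approves a held payment."""
    if payment.status == "COMPLIANCE_HOLD":
        payment.status = "RELEASED"
    return payment
```

With `unavailable_check` passed as the screening function, a new payment ends up in `COMPLIANCE_HOLD` and stays there until `approve` is called.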

Report: "Data Warehouse Delays"

Last update
resolved

This issue has been resolved. Please contact Snapsheet Support if you experience any further issues.

monitoring

The delays in replication have been resolved and we are continuing to monitor.

identified

We are seeing normal throughput for the data pipeline and expect the replication delays to be fully resolved shortly.

identified

We are continuing to work on addressing the replication delays and have engaged AWS support for Amazon's Data Migration Service where the delay is being experienced.

identified

We are currently seeing delays in replication through the data warehouse. We are actively working on addressing the delay and will provide another update as soon as possible.

Report: "US UAT Navigation bar Issue"

Last update
resolved

This incident has been resolved. Please contact Snapsheet Support if you experience any further issues.

investigating

We are continuing to investigate the issue and will provide an update as soon as possible.

investigating

We are investigating an issue with the navigation bar in US UAT and will provide an update as soon as possible.

Report: "Recent claims are unsearchable"

Last update
resolved

This incident has been resolved. Please contact Snapsheet Support if you experience any further issues.

identified

We are currently seeing delays in claims being indexed for search. We are investigating and will provide updates as they become available.

Report: "Processing Delays"

Last update
resolved

This incident has been resolved.

investigating

The processing delays have been resolved. We will continue to monitor performance and provide a post-mortem as soon as it becomes available.

investigating

We are currently experiencing processing delays with claim details showing up in the work queues.

Report: "Snapsheet Claims Outage"

Last update
postmortem

## Incident summary

Between approximately 5:15 PM CT and 8:15 PM CT on July 30th, 2024, the US instance of Snapsheet Claims experienced degraded response times, resulting in timeouts and pages failing to load for users. The event was triggered by an [AWS outage](https://health.aws.amazon.com/health/status?eventID=arn:aws:health:us-east-1::event/MULTIPLE_SERVICES/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE_F0449_625BDDE4846) that affected 64 AWS services in the `us-east-1` region. The outage started at approximately 5:00 PM CT, was first reported by AWS on their status page at 5:40 PM CT, and was marked as resolved by AWS at 11:55 PM CT.

We apologize for the inconvenience this caused and are committed to continually improving the resiliency of the Snapsheet Claims platform.

## Timeline

All times are CT.

- 5:10 PM: An automated alert indicated that response times were degraded for one of the backend services for Snapsheet Claims, and engineering began investigating immediately.
- 5:15 PM - 5:35 PM: Several support tickets were raised by clients indicating that they were experiencing issues loading pages within Snapsheet Claims.
- 5:35 PM: Response times continued to escalate across two different Snapsheet Claims backend services, and an incident was published on the Snapsheet status page.
- 5:40 PM: AWS reported that an ongoing incident in the `us-east-1` region was impacting multiple services.
- 6:00 PM: Snapsheet engaged directly with AWS resources to get more information on the impact and mitigation options for the incident.
- 7:00 PM: Response times returned to normal for one of the two impacted Snapsheet Claims backend services.
- 8:00 PM: Response times started improving for the remaining degraded Snapsheet Claims backend service.
- 8:15 PM: Response times returned to normal for Snapsheet Claims, and the incident was considered resolved after a period of monitoring.

## Root Cause

The event was triggered by an [AWS outage](https://health.aws.amazon.com/health/status?eventID=arn:aws:health:us-east-1::event/MULTIPLE_SERVICES/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE_F0449_625BDDE4846) that affected 64 AWS services in the `us-east-1` region. The outage started at approximately 5:00 PM CT, was first reported by AWS on their status page at 5:40 PM CT, and was marked as resolved by AWS at 11:55 PM CT.

AWS had an issue with the Kinesis service. Kinesis is a fully managed AWS service that enables real-time collection, processing, and analysis of streaming data at scale. Amazon CloudWatch experienced elevated error rates and latencies due to its dependency on the degraded Kinesis service. Amazon CloudWatch is a dependency across most Amazon services for logging and monitoring. This led to cascading failures across 64 AWS services, as indicated in the [AWS outage](https://health.aws.amazon.com/health/status?eventID=arn:aws:health:us-east-1::event/MULTIPLE_SERVICES/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE/AWS_MULTIPLE_SERVICES_OPERATIONAL_ISSUE_F0449_625BDDE4846) details.

Snapsheet does not use Amazon Kinesis directly but was impacted by the internal dependency that Amazon CloudWatch has on Kinesis. Two Snapsheet Claims backend services that leverage Amazon Elastic Load Balancer and Elastic Container Service were impacted despite being available across 6 different availability zones (discrete data centers). Amazon Elastic Load Balancer and Elastic Container Service were impacted due to their dependency on CloudWatch for logging. Because we did not have direct visibility into the AWS issue, we relied on correspondence with AWS, who confirmed through their back-end tooling and logging that the two Snapsheet Claims backend services were impacted by the AWS outage.

## Preventative Measures

All Snapsheet platform services are available across multiple availability zones within each region and are configured to fail over automatically when a disruption occurs. Unfortunately, in this case, the AWS internal dependency on Amazon CloudWatch caused failures across all availability zones simultaneously.

Snapsheet also has the ability to restore services across AWS regions. Because this incident occurred in internal AWS service dependencies, Snapsheet did not have visibility into where the issue was coming from until Amazon provided additional information, which made it difficult to determine whether we should start rotating certain services to a different region.

Snapsheet will be working with AWS as additional details of their root cause analysis become available and will be investigating multiple options for preventing and mitigating similar issues in the future.
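The dependency chain described in the root cause (Kinesis, then CloudWatch, then Elastic Load Balancer and Elastic Container Service, then the two backend services) can be illustrated with a small graph walk. This is only a sketch of the reasoning, not a model of AWS internals. The AWS edges paraphrase the postmortem text, and the names `claims-backend-a` and `claims-backend-b` are hypothetical placeholders for the two unnamed Snapsheet Claims backend services.

```python
from collections import deque

# Each key depends on the services in its list. The AWS edges paraphrase the
# postmortem; the two backend service names are hypothetical placeholders.
DEPENDS_ON = {
    "claims-backend-a": ["Elastic Load Balancer", "Elastic Container Service"],
    "claims-backend-b": ["Elastic Load Balancer", "Elastic Container Service"],
    "Elastic Load Balancer": ["CloudWatch"],
    "Elastic Container Service": ["CloudWatch"],
    "CloudWatch": ["Kinesis"],
    "Kinesis": [],
}


def impacted_by(failed, depends_on):
    """Return every service that depends on `failed`, directly or transitively."""
    # Invert the graph so we can walk from the failed service to its dependents.
    dependents = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first search outward from the failed service.
    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


if __name__ == "__main__":
    for name in sorted(impacted_by("Kinesis", DEPENDS_ON)):
        print(name)
```

In this toy graph neither backend service lists Kinesis as a dependency, yet both appear in the result. The dependency chain does not vary by availability zone, which is consistent with the postmortem's observation that spreading the services across availability zones did not isolate them from the failure.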

resolved

This incident has been resolved. We will continue to monitor performance and provide a post-mortem as soon as possible.

monitoring

We have seen improved response times and normal performance within Snapsheet Claims over the past half hour. We will keep working with AWS until they confirm that the issue is resolved, and to produce a full post-mortem.

identified

We are remaining engaged with the AWS team as they continue to address the widespread AWS outage. They have identified the root cause and are actively working on multiple parallel paths to mitigate the issue. We will provide updates as they become available.

investigating

We are continuing to investigate, but the issue appears to be caused by the AWS outage. We are working with the AWS team to get more information and will provide updates as they become available.

investigating

We have identified that one of our underlying vendors, AWS, is reporting issues across multiple services: https://health.aws.amazon.com/health/status. We are continuing to investigate and will continue to provide updates as they become available.

investigating

We are currently experiencing issues with accessing Snapsheet Claims. We are investigating urgently and will provide an update as soon as possible.

Report: "Metrics Showing Internal Error"

Last update
resolved

This incident has been resolved.

investigating

Incident Resolved

investigating

We are currently experiencing an issue resulting in metrics dashboards not loading. We are investigating and will provide another update as soon as possible.

Report: "Data Processing Delays - Reporting Tools Affected"

Last update
resolved

The scheduled migrations have concluded.

monitoring

On Saturday, April 27th, starting at approximately 2 am Central Standard Time, scheduled data migrations will take place, which may result in temporary delays in the data pipeline.

Report: "Twilio SMS Incident"

Last update
resolved

Twilio reported: All services impacted by the issues with the Twilio REST API have recovered and are operating as intended. This incident has been resolved.

monitoring

Twilio: We're seeing recovery in the Twilio Rest API, impacting multiple Twilio services. We will be monitoring all services to ensure a full recovery and provide another update in 30 minutes or as soon as more information becomes available.

monitoring

Twilio reported that they have identified and are working to fix an issue around SMS. We are monitoring the incident on our end and will post any updates here. Twilio Service Status: https://status.twilio.com/

Report: "Increased SMS Error Rates"

Last update
resolved

This incident has been resolved.

monitoring

Our vendor has reported an issue and is currently seeing elevated error rates, causing SMS delivery delays. We will continue to monitor the issue and provide updates as they become available.

Report: "Increased SMS Error Rates"

Last update
resolved

Our SMS provider has updated us stating this has been resolved. We will continue to monitor and provide any updates if there is a change.

monitoring

We continue to see a small number of SMS delivery failures. We are continuing to engage with our SMS vendor to determine the resolution.

monitoring

We continue to work with our vendor, who has identified the issue and pushed a fix to resolve the issues with SMS delivery. We will continue to monitor the situation and provide any updates as needed.

investigating

We are seeing an increased error rate for SMS delivery. We are currently investigating and will provide additional updates once available.

Report: "Informational Announcement"

Last update
resolved

This incident has been resolved.

monitoring

We are monitoring an ongoing AWS outage. Currently, none of our services are affected. We are posting this to raise awareness. We will continue to monitor and share updates as they become available.

Report: "Issues with Search Functionality"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented, and we are monitoring the results. The system appears to be operational.

identified

We have identified an issue with the search functionality and are currently looking into a resolution.

Report: "Issue when trying to load photos and documents within a claim (AWS)"

Last update
resolved

Our workaround has resolved this issue, and we are continuing to work with AWS on an RCA.

monitoring

We’ve rolled out a workaround to mitigate this issue as we are still working with AWS.

identified

We have identified and are currently testing a workaround. We are still working with AWS to obtain an RCA and resolution to the original issue. We will continue to provide updates as they become available.

investigating

We are continuing to investigate this issue.

investigating

We spoke with a network specialist at AWS, who will work with their team to escalate this issue within AWS as we continue trying to identify the source of the disruption we are seeing when accessing files in vice-backend production. They will continue the investigation later tonight and tomorrow morning.

investigating

We've found more information that suggests the problem is in the AWS network handling some IPv6 requests to files in vice-backend production. The AWS S3 team is helping to investigate. No ETA at this time.

investigating

The Snapsheet Team is still troubleshooting with AWS to resolve the issue. AWS has escalated the issue, and we are awaiting feedback from their Engineers.

investigating

Some users are unable to load photos and documents within a claim. We are currently investigating and following up with AWS.

Report: "SmartCommunications/AWS Issue"

Last update
resolved

This incident has been resolved.

monitoring

Summary of Issue:
- Time/Date Issue Identified --> Thursday 7/28/22 at approximately 1pm Central Time
- Description of Issue --> Users are unable to access the SmartCommunications Draft Editor and are receiving a loading error message.
- Next Steps --> This issue was caused by an AWS outage. Snapsheet and SmartCommunications are actively addressing this matter. SmartCommunications informed Snapsheet that the issue is being fixed, there has been improvement, and that the issue could be fully resolved in about 1 hour. Snapsheet will share another update once new information is available. We're terribly sorry for the issue.
- Link Regarding AWS Outage --> https://seekingalpha.com/news/3862592-amazon-investigating-aws-outage-in-us-east-region?utm_source=feed_news_all&utm_medium=referral
- Resolution Reached --> This issue was fully resolved at approximately 5pm Central Time

Report: "Details Not Appearing in Claim File's "History" Section"

Last update
resolved

This incident has been resolved.

investigating

Summary of Issue:
Time/Date Issue Identified --> Thursday 6/2 at 7:53am CT
Description of Issue --> Upon visiting a claim file's "History" section, details aren't appearing in the default "Claim History" view.
Workaround --> If you visit the exposure's "History" section, you can see the entries for notes, etc.
Next Steps --> Snapsheet is actively investigating this matter and we'll share another update ASAP. We're terribly sorry for the issue.
Resolution Reached --> At approximately 12:40pm CT the fix was deployed to the impacted environments.

Report: "Photos/Documents Not Appearing in "Photos and Documents" Section of Claim File"

Last update
resolved

This incident has been resolved.

identified

Summary of Issue:
Time/Date Issue Identified --> Thursday 5/5 at 9:18am CT
Description of Issue --> Upon initially visiting the "Photos and Documents" section of the claim file, photos/documents aren't appearing in the default view.
Workaround --> In the "Photos and Documents" section, if you change the view by clicking on the square icon that is below the "Download" button and off to the right, you should be able to view the photos/docs. Photos/docs may also be available in the exposure within the "Photos and Docs" tab.
Next Steps --> Snapsheet is preparing a fix and plans to deploy it today.
Resolution Reached --> At approximately 4:40pm CT the fix was deployed to the impacted environments.

Report: "Snapsheet Claims - Slower Performance"

Last update
resolved

This issue has been resolved. We're terribly sorry for the inconvenience.

monitoring

Snapsheet deployed a fix and is monitoring the situation.

identified

Summary of Issue:
- Time/Date Issue Identified --> Tuesday 2/8 at 8:09am CT
- Description of Issue --> When trying to access Snapsheet Claims, some users may encounter a blank screen and/or the page(s) load slowly.
- Next Steps --> Snapsheet is actively addressing this matter and is preparing a fix for the issue. Snapsheet will provide updates as new details become available.

Report: "Issue Identified - AWS Outage"

Last update
resolved

Summary of Issue:
- Time/Date Issue Identified --> Tuesday 12/7/21 at 10:30am CT
- Description of Issue --> Snapsheet became aware that AWS experienced an outage, which in turn caused an issue with the Snapsheet Claims Management System. Pages are loading slowly and some users may not be able to access the Claims Management System.
- Next Steps --> Snapsheet is actively monitoring this situation and will post subsequent updates as new information becomes available. AWS hasn't yet shared when this issue will be resolved.
- Resolution Reached --> The morning of 12/8/21, AWS announced the issue was fully resolved.