Historical record of incidents for Spruce Health
Report: "Call transfer to extension failures"
Last update: We have identified and fixed an issue with transferring calls to internal extensions where the call would be dropped. The call transfer failures began around 4:50 A.M. PDT and were resolved around 7:50 A.M. PDT. These errors occurred only for transfers to extensions. Transfers to full 10-digit numbers continued to work properly. We are continuing to monitor the situation and expect transfers to extensions to be working correctly now.
Report: "Delivery Failures for Outbound SMS, Verification Codes & Invites"
Last update: As of 4:00 AM PT on June 5, 2025, we began observing issues with outbound SMS delivery affecting a subset of customers. This disruption is also impacting the delivery of verification codes required for account creation and login, and may be preventing the successful sending of secure messaging invites. Our engineering team is actively investigating the root cause. We will share updates here as we learn more.
Report: "[Resolved] SMS Delivery Failures from AT&T Network to a subset of Spruce Phone Numbers"
Last update: The system is operating as normal. Our telecommunications infrastructure partner has informed us that the issue with inbound SMS messages from the AT&T Network to a subset of Spruce phone numbers has been resolved. It began March 28, 2025 around 8:45 PT and was resolved March 31, 2025 around 1:00 PT. We will continue to monitor.
We are observing successful SMS delivery when sending messages from the AT&T Network. We will continue to monitor to ensure full service recovery.
The issue has been acknowledged by our telecommunications infrastructure partner and they are actively collaborating with AT&T on resolving this issue.
Our telecommunications infrastructure partner has confirmed an issue in receiving SMS from AT&T mobile users in some cases. They are actively collaborating with AT&T to resolve this issue. This issue started on March 28, 2025 around 8:45 PT and is ongoing.
Report: "Intermittent call failures"
Last update: Our partner has resolved the issue. All inbound and outbound phone calls are expected to operate normally.
Our telecommunications provider has implemented a fix, and we started seeing improvements around 9:45am PT. We are continuing to monitor the situation.
Our telecommunications partner is experiencing issues with call routing. This has resulted in the inability to make or receive calls for some Spruce users. This issue started on March 31, 2025 around 8:10 A.M. PT and is ongoing.
Report: "[RESOLVED] Spruce inbox inaccessible; inbound calls impacted"
Last update: The Spruce application was inaccessible across desktop, web, iOS, and Android from 11:48 AM PT to 12:02 PM PT on March 11, 2025 due to an unexpected database issue.

User impact during this time:
- The Spruce Inbox was unavailable to patients and providers
- Outbound calls, SMS, video calls, and secure messages failed
- Inbound calls failed with an error message to the caller. Most call events were delivered to the Inbox with a 10-15 minute delay, but a few were not and are being manually addressed
- Inbound SMS and faxes were delayed by 10-15 minutes

We are working with our cloud infrastructure provider to determine the root cause of the database issue and exploring ways to prevent similar incidents in the future.
The platform has recovered and we are continuing to monitor.
We are investigating an issue with unavailability of the Spruce platform
Report: "[RESOLVED] Users unable to reach Spruce phone numbers with area code 317"
Last update: We have more reports coming in of successful call routing to Spruce phone numbers with the 317 area code. We're considering this incident resolved. If you continue to experience issues with callers unable to reach your 317 area code Spruce phone numbers, please don't hesitate to reach out to us at support@sprucehealth.com or via the Spruce Support conversation in-app.
We've been working closely with our telecommunication infrastructure provider to get this issue resolved. As of 1:35pm PT today (February 18), we have started to see recovery, with multiple practices confirming that calls that were previously failing are now successfully routing to their Spruce phone number. We're continuing to monitor the situation to ensure it is fully resolved before considering this incident closed. We're also working with our telecommunications partner to understand the true root cause of the issue.
We're continuing to work with our telecommunications partner to get to the bottom of the call issues here. We'll provide an update tomorrow (February 18) once we have an update from them.
We have been receiving multiple reports of users unable to reach Spruce phone numbers with area code 317. When dialing a 317-based Spruce phone number, the caller hears an automated message such as "Your call cannot be completed at this time." We received the first report of this issue on February 13.

The impact of this issue is as follows:
- Some providers are unable to make outbound calls via their Spruce phone number
- Some patients are unable to reach Spruce phone numbers with area code 317

If you're experiencing any such outbound/inbound call issues, please don't hesitate to reach out in-app via Spruce Support or email us at support@sprucehealth.com. We're actively working with our telecommunication infrastructure provider to get to the bottom of this issue.
Report: "Contacts not loading, search functional"
Last update: The issue preventing contact lists from loading for providers on Spruce has been resolved. Contacts are now loading as expected, and search remains fully functional. We appreciate your patience while we worked on the fix. If you continue to experience issues with contacts loading, please don't hesitate to reach out over the Support thread in-app or via email at support@sprucehealth.com.
We have identified an issue where contact lists are not loading for providers on Spruce. You can still search for all your contacts. We are working to resolve the issue but it will take us a few hours to put a fix in place. We will update this page as soon as we have a fix in place and have confirmed that contacts load.
Report: "[RESOLVED] SMS Deliverability issues impacting outbound SMS and verification codes"
Last update: If you continue to experience deliverability issues with SMS, please don't hesitate to reach out over the Support thread in-app or via email at support@sprucehealth.com.
The issue here was with our telecommunications infrastructure partner that was experiencing delivery failures to a subset of phone numbers across multiple networks. They have reported that the issue is now resolved and they are seeing successful delivery of SMS messages to networks. We're continuing to work closely with them to ensure that the issue is truly resolved. If you continue to experience deliverability issues with SMS, please don't hesitate to reach out over the Support thread in-app or via email at support@sprucehealth.com.
We are investigating an issue with outbound SMS, including verification codes. We will keep this ticket up to date as we learn more.
Report: "[RESOLVED] Calls and SMS impacted"
Last update: The system is operating as normal. This seems to have been an intermittent underlying issue with our telecom infrastructure provider, which we are continuing to investigate and request updates on.
The issue was temporary and lasted from 10:12 am PT to 10:21 am PT. During this time, inbound and outbound calls were impacted for some organizations: inbound calls likely failed, and corresponding failed call events were posted. Outbound calls were also likely impacted for some organizations. Any inbound/outbound SMS that were delayed were eventually processed. We are continuing to monitor the situation to ensure that the platform is functioning normally. We are also continuing to investigate what caused the issue and how we can prevent it in the future.
As of 10:12am PT, we are seeing calls and SMS impacted where inbound and outbound calls may be failing and SMS may be delayed. We're actively investigating this issue.
Report: "Stale search results when searching for contacts, conversations or messages"
Last update: Search latency is back to a normal state as of 2am PT / 5am ET October 25, 2024. We have a good understanding of what happened here and are looking into ways we can reduce the likelihood of a similar situation in the future. Customer impact was from ~12pm PT / 3pm ET on October 24, 2024 to 2am PT / 5am ET on October 25, 2024, during which search results were stale for conversation, contact and contact list querying. There was no impact to the Spruce inbox or to exchanging calls, SMS, secure messaging, email, video calls or fax on the platform.
We have identified the issue and are working through a plan to fully resolve the issue. The latency for updates in the system to show up in the search results is ~15 minutes. We're working to see how we can improve this and fully resolve the issue.
We're seeing elevated latencies in indexing contacts, conversations and messages for the purpose of searching. This is resulting in potentially stale search results being returned when searching for contacts, conversations and messages or even when loading contact lists. This issue started around 12pm PT / 3pm ET and is ongoing. We're actively investigating the issue and will post updates here as soon as we have them. There is no impact to loading the Spruce inbox, making/receiving calls, exchanging SMS, email or fax on Spruce.
Report: "Spruce experiencing issues with calls, SMS and loading inbox."
Last update: This incident has been resolved.
This issue has since been resolved but we are monitoring the situation as everything returns to normal.
We identified an issue causing unavailability of SMS, calls, fax and loading of the inbox. We are investigating a resolution.
Report: "[RESOLVED] Messages delayed to Spruce Inbox"
Last update: The system has recovered after the capacity increase. There should no longer be a delay in messages showing up in the Spruce inbox. The customer impact was from 07:20 PT to 10:40 PT on Sept 3, 2024.
We have increased the capacity of the system to reduce the delay in messages showing up in the Spruce inbox. The increased capacity has resulted in recovery where messages are now showing up in the inbox in a timely fashion. We're continuing to monitor the system to ensure normal operations while also leaving the increased capacity in place.
Some providers and patients may be experiencing a delay in messages showing up in the Spruce inbox. There is no impact to making/receiving calls, video calls, or sending messages.
Report: "[RESOLVED] Call transfers from VoIP desk phones not working"
Last update: The root cause of this outage was the deployment of new code that attempted to manage the transferring of calls to international numbers. The code incorrectly assumed that the “to” identifier for the transfer was always a phone number and attempted to parse it as such. In reality, this “to” information can also be a SIP or Client identifier. Failing to parse these identifiers as phone numbers, and treating that failure as a hard failure for the call, resulted in the transfer flows failing.
From 5:03 AM PT to 7:56 AM PT on August 7, 2024, call transfers from VoIP desk phones failed to complete and resulted in an error being reported to the user initiating the call transfer. No other inbound/outbound calls were impacted during this time. The rest of the platform continued to operate as intended.
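The failure mode described in the root cause above can be illustrated with a minimal sketch. This is not Spruce's code: the `sip:` and `client:` prefixes, the `TransferTarget` type, and both function names are assumptions made for illustration. The idea shown is to classify the “to” value first and apply phone-number logic only to values that are actually phone numbers, so that a parse miss never becomes a hard failure for the call.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferTarget:
    kind: str   # "phone", "sip", "client", or "unknown"
    value: str


# Loose phone pattern: optional "+" followed by 7 to 15 digits (assumed format).
_PHONE_RE = re.compile(r"^\+?[0-9]{7,15}$")


def classify_transfer_target(to: str) -> TransferTarget:
    """Classify a transfer target without raising on non-phone input."""
    to = to.strip()
    lowered = to.lower()
    # Hypothetical prefixes for SIP and Client identifiers.
    if lowered.startswith("sip:"):
        return TransferTarget("sip", to)
    if lowered.startswith("client:"):
        return TransferTarget("client", to)
    # Drop common phone-number punctuation before matching.
    digits = re.sub(r"[\s().-]", "", to)
    if _PHONE_RE.match(digits):
        return TransferTarget("phone", digits)
    # Unrecognized input is reported as "unknown" instead of failing the call.
    return TransferTarget("unknown", to)


def is_international(target: TransferTarget, home_prefix: str = "+1") -> bool:
    """International handling applies only to phone targets outside home_prefix."""
    return (
        target.kind == "phone"
        and target.value.startswith("+")
        and not target.value.startswith(home_prefix)
    )
```

With this shape, a SIP or Client target simply skips the international-number check and the transfer proceeds through its normal path.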
Report: "Spruce inbox failing to load"
Last update: The system is operating normally and the inbox is accessible now. There was no impact to inbound calls, fax or SMS. As the Spruce app was inaccessible, patients and providers likely could not send a message, engage in a video call or make an outbound call.
The system is operating normally and the inbox is accessible now. There was no impact to inbound calls, fax or SMS. As the Spruce app was inaccessible, patients and providers likely could not send a message, engage in a video call or make an outbound call. Impact was from 11:39am PT - 11:51am PT on May 1 2024.
We started to see recovery as of 11:50am PT where the inbox was accessible for users. We're continuing to monitor the situation to ensure that the platform remains operational.
We have identified the issue and are working to resolve the issue as we speak. Inbound calls are not impacted. Outbound calls may be impacted.
We are investigating an issue with the Spruce inbox failing to load.
Report: "Outbound and Inbound calls not connecting on Spruce"
Last update: Our telecommunications infrastructure provider has confirmed that the system is fully operational now. Inbound and outbound calls for many customers were impacted between 7:15am PT and 10:30am PT on April 22, 2024. All call events, including any voicemails, should be in the Spruce inbox. We apologize for the disruption this issue may have caused to your practice operations. We're working closely with our telecommunications provider to identify the root cause of the issue and to find ways to reduce the likelihood of such an issue occurring again.
Our telecommunications infrastructure provider identified the problem to be a scaling issue. They have scaled their resources appropriately and are starting to see recovery on their end. In our testing, inbound and outbound calls seem to be working now. It will likely take a few more minutes for inbound and outbound calls to work for all customers. We'll continue to monitor the situation and post an update here.
Our telecommunications infrastructure provider has acknowledged that this is an issue on their end. Their engineering team is actively looking into the issue. We are in close contact with them and will continue to update this incident page as soon as we have another update to share.
We are receiving multiple reports of practices being unable to hear the other party on inbound and outbound calls. We are actively investigating the issue and will keep this incident page updated as we have updates to share.
Report: "Users on Comcast/Xfinity reporting issues connecting to Spruce"
Last update: The external network connectivity issues have been resolved. The service should now be fully accessible for everyone.
We are receiving reports of users on Comcast/Xfinity's internet service being unable to connect to Spruce. The Spruce platform remains operational and this seems to be isolated to users on Comcast/Xfinity. If you are on Comcast/Xfinity, please try connecting through a different internet service provider to reach Spruce. You could try the mobile apps via a cellular network connection in the meantime. We are actively monitoring the situation and ensuring that the impact is not more widespread. We are also looking for an official report on this to ensure that our assessment is correct.
Report: "AT&T outage impacting calls and SMS on Spruce"
Last update: AT&T is reporting that, as of 3pm ET, service has been restored to all impacted users. Spruce services should be operating normally for patients and providers at this point. If you are experiencing any issues, please don't hesitate to reach us at support@sprucehealth.com.
AT&T is currently experiencing an outage, which is having the following impact.

Providers with AT&T as their cell phone service provider may experience the following:
- Unable to make calls from the Spruce application
- Unable to receive verification codes to log in to Spruce

All providers may experience the following:
- Callers using AT&T unable to call Spruce phone numbers
- Texters using AT&T unable to SMS Spruce phone numbers
- Patients using AT&T unable to receive calls from providers

The following services are NOT impacted:
- Exchanging secure messages, fax and email via Spruce
- Video calling on Spruce
Report: "Video calls not connecting"
Last update: Video calls should now be functioning on Spruce. The issue was with our underlying video infrastructure provider. We are still gathering the details on what went wrong on their end and will post more details as a postmortem to this incident page. We're sorry for the inconvenience this incident caused to your business.
A fix has been implemented and we are seeing the issue being resolved. We will continue to monitor to ensure video calls are fully operational.
We have engaged our video infrastructure provider to help investigate. We have been able to reproduce this issue on our end. We're actively working on this issue and will update this page as soon as we have an update to share.
We are investigating an issue with video calls where patients are unable to join a video call with their provider.
Report: "[RESOLVED] Search & Contact Filters experiencing intermittent errors and slowness"
Last update: The platform is fully operational. Customer impact was from 1:08pm PT to 3:14pm PT.
The system should be fully operational now and search results should no longer be stale. We will continue to actively monitor the platform to ensure smooth operations.
Users may be experiencing degraded performance for the following actions:
- Searching for contacts or conversations: may give errors, return results with a delay, or return stale results
- Creating conversations: may return errors or lag when searching for contacts
- Sending bulk messages: may be stuck in the Processing state for a while or take longer to complete
- Opening contact lists: may return errors or lag in returning results

Phone calls, video calls, and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally. We are actively working on resolving this issue and will provide an update soon.
Report: "Delayed Sending of Outbound Messages"
Last update: Outbound SMS is fully functional at this point, along with the rest of the services.
A fix has been implemented and we are monitoring the results.
An issue with our outbound message processing caused SMS, fax, and email messages to be delayed during the period of this event. We have since addressed the issue and outbound messaging is behaving normally again. No other operations were impacted during this event.
Report: "[RESOLVED] Inbound SMS delayed"
Last update: The issue has been resolved and all systems are operating normally. We're sorry for the inconvenience caused here, and are looking into ways to detect such a failure early on in the deployment process.
All inbound SMS from 11:21am PT to 1:53pm PT on November 7 2023 was delivered to the Spruce inbox in a delayed manner. There was no impact to outbound SMS or to any other aspect of the Spruce platform. All inbound SMS that were delayed had an indicator in the message itself to indicate how long they were delayed by (the time between when the SMS was received by our telecommunications carrier partner and when the SMS was delivered to the Spruce inbox).
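As a rough illustration of the delay indicator described above, the sketch below computes the gap between the two timestamps and formats it as a short label. The function name, its parameters, and the "Delayed by ..." wording are assumptions for illustration. The source does not say how the indicator was actually worded or produced.

```python
from datetime import datetime, timezone


def delay_indicator(received_by_carrier: datetime, delivered_to_inbox: datetime) -> str:
    """Return a short label for the delivery delay, or "" if there was none."""
    seconds = int((delivered_to_inbox - received_by_carrier).total_seconds())
    if seconds <= 0:
        return ""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if not parts:
        # Delays under one minute are reported in seconds.
        parts.append(f"{secs}s")
    return "Delayed by " + " ".join(parts)


# Example timestamps (hypothetical values, in UTC).
received = datetime(2023, 11, 7, 19, 21, 0, tzinfo=timezone.utc)
delivered = datetime(2023, 11, 7, 20, 26, 30, tzinfo=timezone.utc)
label = delay_indicator(received, delivered)
```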
Report: "Intermittent failures"
Last update: While deploying changes, we were hitting rate limits with our underlying infrastructure provider. This was resulting in intermittent errors while using the apps, but most functionality, including calls/SMS/fax, continued to work without any known problems. We now have higher limits to avoid this happening again in the future.
We are currently investigating reports of intermittent failures affecting the web and mobile apps.
Report: "Partial MMS/SMS message delivery failures"
Last update: The issues involving SMS/MMS have been resolved and all messaging should be returned to normal.
Our upstream infrastructure partner has reported that the underlying issue causing delivery failures to Verizon and AT&T networks is resolving and successful sends of SMS/MMS messages should be returning to normal rates.
We are experiencing MMS/SMS message delivery failures to Verizon and AT&T networks in the US for a subset of phone numbers, due to incorrect message filtering (blocking) by an upstream infrastructure partner. We are working on the problem and will update regularly.
Report: "Search & Contact Filters experiencing intermittent errors and slowness"
Last update: We identified the reason for performance degradation to be an imbalance of data spread across the cluster. We have since resolved this issue and ensured that all stale search results have been cleared. The system is fully operational at this point.
Users may be experiencing degraded performance for the following actions:
- Searching for contacts or conversations: may give errors, return results with a delay, or return stale results
- Creating conversations: may return errors or lag when searching for contacts
- Sending bulk messages: may be stuck in the Processing state for a while or take longer to complete
- Opening contact lists: may return errors or lag in returning results

Phone calls, video calls, and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally.
Report: "Issues with inbound and outbound fax delivery."
Last update: The issue with faxes has been resolved.
Our fax provider is currently experiencing issues with inbound and outbound faxes. We will continue to monitor and update as more information is available from https://status.phaxio.com
Report: "Degraded SMS functionality"
Last update: Our telecommunications provider confirmed all systems to be operational at 7:44pm. During this incident, outbound SMS from Spruce was delayed in some instances from 5pm PT to 5:50pm PT. Users may not have been able to log in to Spruce if a verification code was required between 5pm PT and 5:50pm PT. All other aspects of Spruce were functional during this time.
While we are not seeing any issues with outbound SMS, our telecommunications provider continues to report degraded functionality. So we'll continue to monitor on our end.
Outbound SMS looks all caught up at this point with no delays in sending SMS. We'll continue to monitor to see if the issue persists and wait until we have received confirmation from our telecommunications provider that SMS is functional on their end.
Users may not be able to log in to Spruce if they are waiting on the verification code since SMS functionality is degraded. Our telecommunications provider is experiencing issues with SMS functionality which in turn is impacting Spruce's ability to send SMS.
We are seeing elevated errors on the system when attempting to send SMS from Spruce. Our telecommunications infrastructure provider is reporting an issue in the SMS functionality which corresponds to the timeline of errors on our end. Outbound SMS is definitely impacted given that we are seeing errors. All outbound SMS that was sent from the Spruce app will eventually be delivered when our telecommunications provider's systems are operational again. It is unclear if inbound SMS to Spruce is impacted. No other functionality on Spruce has been identified to be impacted at this point.
Report: "Call connectivity issues with Verizon"
Last update: The issue was resolved around 7:45am PT on July 26 as per a statement from Verizon.
Some users may be experiencing call connectivity issues due to an outage with the Verizon wireless network. Users may get an "all circuits are busy" message when placing a call. All other parts of the Spruce system are functioning normally. We are closely monitoring the situation and will post updates here.
Report: "Delayed push notifications"
Last update: This incident has been resolved.
AWS has marked their incident as resolved, and all seems to be operating normally on our end.
Push notifications have all caught up. We will keep this ticket open until we have confirmation from AWS that the issue is resolved. We are no longer seeing any delays in processing push notifications.
Push notifications are delayed due to an ongoing issue with our underlying infrastructure provider, Amazon Web Services (AWS). We have a workaround to resolve the delayed push notifications despite the ongoing AWS issue, and are working to deploy this workaround. Mobile apps may receive push notifications for inbound and outbound messages in a delayed manner. The web application may refresh the state of the inbox in a delayed manner.
Report: "Outbound SMS, email and fax delayed"
Last update: All delayed outbound SMS, email and fax are now caught up and we are no longer experiencing a delay in sending messages. We will continue to monitor the system to ensure that it is functioning smoothly.
We continue to experience delayed SMS, email and fax. We have identified the problem to be with our telephony infrastructure provider and are actively working with them to resolve the issue.
We're currently seeing delays in outbound SMS, email and fax. No impacts to inbound SMS, phone calls, video calls or secure messaging.
Report: "Call connectivity issues"
Last update: Our telecommunications infrastructure provider is seeing recovery on their end. Outbound and inbound calls should be fully functional at this point.
Some providers may be experiencing call connectivity issues due to failures on our underlying telecommunications provider's end. We are closely monitoring the situation and will post updates here.
Report: "Obi1032 and Obi1062 devices cannot make outbound calls"
Last update: This incident has been resolved.
We have deployed a fix so that outbound calls from Obi1032 and Obi1062 devices now work. We are continuing to monitor the system to ensure there are no problems with outbound and inbound calls across all device types.
We have identified an issue where any medical practice using the Obi1032 and Obi1062 desk phones is unable to place outbound calls. Users hear a busy tone when placing an outbound call. Inbound calls work fine. Inbound and outbound calls from the smartphone application and other desk phone models continue to operate normally. No other aspect of the system is impacted.
Report: "Search & Contact Filters experiencing intermittent errors and slowness"
Last update: The system is fully operational now, with all search results having caught up. We're really sorry for the inconvenience caused here and have decided to operate with extra capacity to ensure that the system continues to operate normally moving forward.
Search results and contacts are now loading without any errors. We are working to get search results to reflect the latest data. Users should no longer have any issues creating conversations, sending bulk messages or opening contact lists. Phone calls, video calls, and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally. We will post an update as soon as search results have fully caught up.
We continue to actively work on the issue. We will post another update in another hour here as we assess progress of the fix.
We have identified the reason for the degraded performance and are actively working on resolving the issue.
Users may be experiencing degraded performance for the following actions:
- Searching for contacts or conversations: may give errors, return results with a delay, or return stale results
- Creating conversations: may return errors or lag when searching for contacts
- Sending bulk messages: may be stuck in the Processing state for a while or take longer to complete
- Opening contact lists: may return errors or lag in returning results

Phone calls, video calls, and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally. We have identified the reason for the degraded performance and are actively working on resolving the issue.
Report: "Search & Contact Filters experiencing intermittent errors and slowness"
Last update: This incident has been resolved.
Spruce is back to being fully operational. We are monitoring the system to ensure that all continues to look good.
We are continuing to investigate this issue.
Users may be experiencing degraded performance for the following actions:
- Searching for contacts or conversations: may give errors, return results with a delay, or return stale results
- Creating conversations: may return errors or lag when searching for contacts
- Sending bulk messages: may be stuck in the Processing state for a while or take longer to complete
- Opening contact lists: may return errors or lag in returning results

Phone calls, video calls, and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally. We have identified the reason for the degraded performance and are actively working on resolving the issue.
Report: "Spruce experiencing issues with contact and conversation search related activities."
Last update:

## **Summary + customer impact**

The Spruce system experienced degraded performance from 10:15am PT to 6:50pm PT on March 15, 2023. During this time:

* Searching for contacts or conversations resulted in errors, slowness or stale results
* Searching contacts to create conversations resulted in errors or slowness
* Bulk messages sent were delayed due to being stuck in the Processing state for a while or just taking longer to complete
* Opening contact lists likely returned errors or was slow to return results
* Contact exports were delayed due to being stuck in the Processing state for a while or just taking longer to complete

The degraded performance was caused by inefficient data distribution across the search cluster, where one of our data nodes experienced heavy load and was unable to process any new indexing and search operations.

## **Analysis**

The Spruce engineering team immediately reacted to the monitoring alarms that were triggered and began investigating the issue. The issue was caused by one of the data nodes storing a significantly larger amount of data than the other data nodes in the cluster, which resulted in that node taking longer to process requests. Requests piled up over time, putting the node under heavier load until its request queue was exhausted and new requests were rejected.

The engineering team reviewed and analyzed the cluster configuration and performance metrics in detail, and took the following steps to resolve the issue:

* Adjusted the data distribution strategy so that data is allocated equally across the data nodes in the search cluster. The old strategy was inefficient because the size of the stored data grew significantly over time and it could not support the demands of indexing and search operations.
* Increased the number of data nodes and reallocated the data equally across the nodes.

This was done in the background with minimal impact on the searching and indexing of new data. With the new configuration, stored data was uniformly distributed across all nodes. During the degraded performance, the other data nodes in the cluster were fully operational and any requests that were routed to them were processed successfully. Also, all the indexing requests were successfully queued and processed after the issue was resolved.

## **Action items**

* Create additional monitoring alerts for search and indexing operation latency, so that any potential issues can be detected early.
* Review the cluster configuration and performance metrics in detail every 3-6 months.
* Improve the clean-up strategy for unused data to reduce space usage.
* Create an internal playbook with clearly defined steps that can be taken to quickly troubleshoot and resolve issues like this one and thus minimize the impact on clients.
* Increase general knowledge about the search cluster and its configuration within the engineering team.
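The uneven data distribution described in the analysis is the kind of condition a simple skew check could flag. The sketch below is only an illustration under stated assumptions: the function name, the node names, the per-node byte counts, and the 1.5x threshold are all hypothetical, and the source does not describe how the cluster is actually monitored.

```python
from statistics import mean


def find_overloaded_nodes(bytes_per_node: dict[str, int], max_ratio: float = 1.5) -> list[str]:
    """Return nodes whose stored data exceeds max_ratio times the cluster average."""
    if not bytes_per_node:
        return []
    average = mean(bytes_per_node.values())
    if average == 0:
        return []
    return sorted(
        node for node, size in bytes_per_node.items() if size / average > max_ratio
    )


# Hypothetical sizes: one node holds far more data than its peers.
sizes = {"node-a": 100, "node-b": 110, "node-c": 400}
skewed = find_overloaded_nodes(sizes)
```

A check like this would complement, not replace, the latency alerts listed in the action items, since a node can hold a disproportionate share of data well before its request queue fills up.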
The system is fully operational as per our active monitoring over the last 10 hours. Summary: From 10:15am PT March 15 to 6:50pm PT March 15, the following actions on Spruce experienced degraded performance: - Searching for conversations and contacts either took long or failed - Contact filters frequently failed to load when clicked into - When starting a new conversation, contact suggestions either took long or failed to load, making it challenging to start new conversations - Bulk actions (messaging, tagging, deleting) took longer than expected to complete, but eventually completed - Newly created contacts, conversations and messages during this time period were not searchable. The new items eventually became searchable 6:50pm PT onwards - Successful searches for contacts and conversations may have brought up stale results, where an update to a contact or conversation was not reflected in search results. The updated items were eventually updated in the search to reflect their latest versions There was no impact to calls, SMS, Fax, Secure Messaging, Email or Fax during this time. We will post a postmortem to the incident soon.
The redistribution of data in the cluster is still in progress (note that this happens in the background with minimal impact on searching and indexing of new data). We have been closely monitoring the situation throughout the night. We also increased the capacity of the cluster to accommodate the redistribution of data and to ensure that we are in better shape for today. About 20% of the redistribution remains, and we believe it will bring a long-lasting improvement to overall performance. The metrics so far look healthy, with no signs pointing to poor performance or an increased error rate. We will report back here once the redistribution completes or if we see any signs pointing to degraded performance.
Indexing of data has now caught up, so successful searches for any contacts, conversations and messages will bring up up-to-date results. We are continuing to work on better distributing the data across the cluster. We are not currently experiencing poor performance or intermittent errors. This is likely due to the decreased overall traffic in the system given the time of day. That being said, we continue to work on reducing the likelihood of this problem continuing into the business day tomorrow.
We have identified a potential cause for the intermittent failures with the search cluster. We are going to work towards better distributing the data across the cluster so as to increase overall performance and reduce the error rate.

To recap, due to the errors throughout the day:

- Searching for conversations, messages or contacts may have failed
- Bulk messages may have taken longer to complete than usual
- Newly created contacts, conversations and messages may not have shown up when searching
- Updates to contacts may not have been searchable

We will continue to work through the evening to reindex the data so as to better distribute it across the cluster, and we will keep this incident up to date as we make progress. We're really sorry for the inconvenience this is causing to your workflows.
We continue to investigate this issue. Note that some bulk message operations may take a long time to complete or may get stuck in a particular state, because bulk message operations hit similar errors when querying contact lists.
We continue to work on the issue here to reduce the intermittent errors while searching for contacts or loading contact lists. Note that there is no impact to phone calls, SMS routing, loading of inbox, exchanging secure messages, or video calling. Bulk messages will continue to send during this time, albeit in a delayed fashion given that bulk messages work off of contact lists. We will update here as we make progress against the performance issue here.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
We are investigating an issue with contact and conversation search related activities.
Report: "Search related activities experiencing intermittent errors and slowness"
Last updateSystem is fully operational.

Summary: From 12:52 PT to 13:57 PT on March 20, 2023, the following actions experienced degraded performance:

- Searching for contacts or conversations: may have given errors, returned results in a delayed manner, or returned stale results
- Creating conversations: may have returned errors or experienced a lag when searching for contacts
- Sending bulk messages: may have experienced a lag, staying stuck in the Processing state for a while or just taking longer to complete
- Opening contact lists: may have returned errors or experienced a lag in returning results
System is mostly recovered and we will continue to monitor to ensure that we do not see degraded performance for any actions. There are still some stale search results being returned for contact and conversation search, but other than that, the system is fully operational at the moment.
We are continuing to work on a fix for this issue.
Users may be experiencing degraded performance for the following actions:

- Searching for contacts or conversations: may be giving errors, returning results in a delayed manner, or returning stale results
- Creating conversations: may return errors or experience a lag when searching for contacts
- Sending bulk messages: may be experiencing a lag, staying stuck in the Processing state for a while or just taking longer to complete
- Opening contact lists: may be returning errors or experiencing a lag in returning results

Phone calls, video calls and exchanging messages (fax, secure, email or SMS) are NOT impacted and are operating normally. We have identified the reason for the degraded performance and are actively working on resolving the issue.
Report: "Spruce Inbox failing to load"
Last update
## **Summary \+ customer impact**

The Spruce system experienced an outage from 6:45am PT to 9:45am PT on January 25, 2023. During this time:

* The Spruce Inbox failed to load for patients and providers.
* Providers could not place voice or video calls to patients given that the inbox would not load.
* Providers could not send SMS or secure messages to patients.
* Patients could not send secure messages to practices.
* Inbound fax, SMS and email arrived in the Spruce inbox in a delayed fashion.
* Inbound calls were operational during this time; however, voicemails arrived in the Spruce inbox in a delayed fashion.
* Workflow automation was executed, albeit in a delayed fashion.

The outage was caused by CPU exhaustion on one of the core databases. The engineering team believes the CPU exhaustion was caused by a frequently run inefficient query, which was optimized as part of the fix. The engineering team will be proactively and closely monitoring the system over the rest of the week to ensure that there are no signs of database CPU exhaustion during peak hours of the day.

## **Analysis**

The Spruce engineering team immediately reacted to the monitoring alarms that were triggered and began investigating the issue. The issue is believed to have been caused by an inefficient query, frequently executed across the entire customer base, whose load built up over time and finally tipped one of the databases into CPU exhaustion. The engineering team deployed a fix for the inefficient query at around 9am PT. It took 45 minutes to bring the system back up once the query optimization fix was deployed, because the engineering team wanted to ensure that bringing the system back all at once did not cause resource exhaustion in various parts of the system. So they brought up the various components one at a time while constantly monitoring the CPU utilization on the impacted database.

The backup communications system was activated around 8am PT so that any customer who had registered their contact information received notifications for inbound calls and SMS over email via a secure expiring URL.

## **Action items**

* Make the backup communications system self-service in the product so that anyone can register for it.
* Automatically send an SMS in response to an inbound SMS when the backup system is activated. Currently, we only send an automated text if the provider has signed up for notifications from the backup system, rather than for any inbound SMS.
* Install a global per-account rate limiter at the API layer so that we have protection against client applications constantly retrying and causing a large spike in traffic when the platform is experiencing issues.
* Gain deeper insight into our asynchronous workers through metrics so that we can look at the overall health of the workers running across the platform and ensure there are no runaway workers causing platform-wide issues.
* Make it part of the on-call engineer’s responsibilities to proactively monitor database performance metrics and identify any database statements that need optimizing before they build up over time.
* Clean up ever-growing tables to ensure that they are not impacting general database performance across key services.
* Investigate how the AWS Performance Insights API can be leveraged to automatically notify the engineering team of database queries that take too long to execute or scan too many rows.
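The per-account rate limiter named in the action items could take many forms. One common approach is a token bucket per account, sketched below. This is a minimal sketch under assumptions rather than Spruce's implementation: the class name `AccountRateLimiter`, the capacity and refill values, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class AccountRateLimiter:
    """Token-bucket limiter keyed by account ID (illustrative sketch only)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity              # maximum burst size per account
        self.refill_per_sec = refill_per_sec  # sustained requests per second
        self.clock = clock                    # injectable so tests can control time
        # account_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, account_id: str) -> bool:
        now = self.clock()
        # A new account starts with a full bucket.
        tokens, last_seen = self._buckets.get(account_id, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[account_id] = (tokens, now)
        return allowed
```

Because each account has its own bucket, a client stuck in a retry loop exhausts only its own allowance and other accounts keep their full budget.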
The Spruce system experienced an outage from 6:45am PT to 9:45am PT on January 25, 2023. During this time:

- The Spruce Inbox failed to load for patients and providers
- Providers could not place voice or video calls to patients, given that the inbox would not load
- Providers could not send SMS or secure messages to patients
- Patients could not send secure messages to practices
- Inbound fax, SMS messages, and email arrived in the Spruce inbox in a delayed fashion
- Inbound calls were operational during this time; however, voicemails arrived in the Spruce inbox in a delayed fashion
- Workflow automation was executed, albeit in a delayed fashion

The outage was caused by CPU exhaustion on one of the core databases. The engineering team believes the CPU exhaustion to have been the result of a frequently run inefficient query whose load built up over time, and which was optimized as part of the fix for this incident. The engineering team will be proactively and closely monitoring the system over the next few days to ensure that there is no sign of database CPU exhaustion, especially during peak hours of the day.

We know how important it is for Spruce to be fully operational at all times. Working to build and maintain a medical communications system gives us all immense energy on a daily basis and is not a job we take lightly. We're very sorry for the outage and the impact to practices and patients. We will continue to work hard in pursuit of a highly available and reliable service. If you have any questions at all, please don't hesitate to reach us at support@sprucehealth.com.
The system should be fully functional at this point. We are continuing to monitor the system. We will post an incident report once we've had a chance to investigate more deeply here.
We are starting to see some recovery and are slowly ramping the system back up to full service while watching the impact on the database and its CPU usage. We'll keep updating this page as we have more to share.
We have made a database optimization for a high frequency lookup. We have intentionally brought down the API layer that clients connect to while we work to ensure that the rest of the system is functional. Once all asynchronous work has been completed, we will turn on the API layer slowly to ensure that we are not seeing any CPU performance issues again.
We continue to investigate the issue but unfortunately have not yet identified a root cause. We are all hands on deck working to identify the reason for the outage.
The backup system for notifying providers of incoming SMS, Fax, call events and voicemails has now been activated. Anyone that has registered contact information for our backup system will now get notified over email. You can read more about the backup system here: https://help.sprucehealth.com/article/424-spruce-backup-system
We have not identified the root cause yet. The inbox continues to be down for most. We are actively investigating this issue.
The Spruce inbox is unable to load at the moment, and consequently patients and providers are unable to view or send messages, send faxes or make calls. Inbound calls should be working. Inbound fax is likely delayed.
Report: "Video call outage"
Last updateUnannounced maintenance from infrastructure provider. Spruce has contingency plans in the event of longer outages from infrastructure providers, but given the limited scope in this event, there is nothing to do differently at the current time.
Video calling should now be fully functional on the platform. Please reach out over Team Spruce in the app or support@sprucehealth.com if you continue to have any issues with video calling.
We are seeing recovery with video calls, and providers should be able to place video calls to patients successfully. We are monitoring the system to ensure that video calling on Spruce is fully operational again before considering this issue resolved.
Video calls currently are failing due to an outage at our service provider. We are working on a resolution and will update with further information.
Report: "Application temporarily inaccessible & SMS delayed."
Last update**Analysis** Unannounced maintenance from major infrastructure provider. Spruce has contingency plans in the event of longer outages from major infrastructure providers, but given the limited scope in this event, there is nothing to do differently at the current time.
This incident has been resolved.
The application was inaccessible for between 5 and 15 minutes due to unannounced maintenance from our cloud services provider, Amazon. During this maintenance period, SMS, fax and email will have been delayed.
Report: "Spruce Inbox intermittently failed to load"
Last updateBetween 5:08am and 5:24am Pacific Time on December 19th, the Spruce Inbox failed to load for some users. Users experienced errors opening the inbox, or a particular conversation. Users may have also experienced intermittent failures when placing an outbound call or posting a message or sending a fax. There was no impact to inbound calls or video calls at this time. Some inbound SMS, fax and call events arrived in the inbox in a delayed manner during this time. From our investigation, the intermittent failures were due to one of the services experiencing a malfunctioning task. We plan to look into this further for how to prevent a particular malfunctioning task from causing intermittent platform wide failures.
Report: "Spruce experiencing issues with outbound calls from the android mobile app"
Last update
## Summary

At 8:23am PT, we released software that caused outbound calls from the Android app to fail. The release was incompatible with the latest version of the Android app. We identified the issue at 9:04am PT and quickly traced the root cause to the software rollout. We deployed a fix at 9:08am PT.

Outbound calls on the Spruce Android app were non-functional for customers running the latest version of the app from 8:23am PT to 9:08am PT \(45 minutes\). Outbound calling from desk phones, iOS and the web app was not impacted. No other aspect of the system was impacted.

We are internally discussing ways to reduce the likelihood of incompatible software rollouts like the one that happened today. We understand how important it is for providers to be able to place outbound calls from all platforms at all times, and we’re very sorry for the inconvenience and the business impact here.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating an issue with outbound calls from the Android mobile app.
Report: "Contact lists and search results failing to load intermittently"
Last updateAs of 12:55pm PT on October 25, 2022, we started to see recovery, and the system has returned to being fully operational. There was a re-allocation of data across the nodes that store and process the search engine data. The re-allocation seems to have helped the overall recovery. We will continue to investigate ways in which we can further optimize our search engine database to prevent such spikes in search latency.
Note that while we are experiencing this performance issue, new contacts, updates to contacts and messages may not be reflected in search results. This is because there is a backup in indexing of new data into the search engine. Also note that failures may be intermittent. Contact lists may fail to load but then if you refresh the page it may end up loading. This will also be the experience with conversation and contact search results.
The performance issues seem to have started again as of 11:02am PT today, despite the infrastructure improvements. We are continuing to investigate to identify any performance bottlenecks.
We have made some underlying infrastructure changes that should help alleviate the performance issues that we were experiencing. We will continue to monitor the overall health of the database that backs searches and contact lists during normal business operations today (October 25 2022) to ensure that the issue is fully resolved and the platform is back to being functional.
We are currently investigating an issue where contact lists and search results fail to load intermittently. There should be no impact to the Spruce inbox, inbound/outbound calls, video calls or exchanging of messages over SMS, secure messaging, email and fax.
Report: "Delayed SMS delivery to AT&T phone numbers"
Last updateOur telecom partner updated us last evening around 7:27pm (October 19) that the issue was resolved on their end and they were no longer seeing delays in sending SMS to AT&T.
Our telecom partner just posted an update saying that they continue to experience message sending delays and that they are working closely with their partner to resolve the issue. They will post an update on this matter soon, likely in the next 6-8 hours.
Our underlying telecommunications infrastructure provider is experiencing SMS delivery delays when sending messages to AT&T phone numbers. The message may appear as sent in your Spruce application, but may take minutes to get to the recipient. This incident with our underlying provider is also impacting verification code delivery over SMS. So if a patient or provider tries to create an account or log in, and their cell phone number is with AT&T, they may not get the verification code for minutes. In this case, it is likely best for the user to try the "Call me with code" option accessible on all platforms. We are actively monitoring this incident and will post an update as soon as we have one.
Report: "[RESOLVED] Delayed inbound sms, voicemails and call events"
Last update
# Summary

The messages were delayed because requests to our transcription provider to transcribe voicemails were timing out. The timeout on uploading a recording to the provider was not correctly tuned, leading to a backlog of messages waiting to be processed by a set of application workers. The messages were still being processed, albeit in a delayed manner, due to the communication issues. Spruce was made aware of the issue via multiple customer complaints, and the engineering team started investigating as soon as the issue was escalated.

# **Action items to mitigate future impact**

* Add an alarm on the application worker responsible for processing transcriptions and SMS. Note that we already had alarms in place for all but one of the workers. This will help ensure that, should an issue like this arise again, we’ll be notified as soon as possible.
* Fine-tune the timeout in communication with the transcription provider to prevent a build-up in the event of communication errors in the future.
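To see why the timeout value matters, consider the worst case in which every call to the provider runs until it times out. Each worker then finishes one message per timeout period, and a backlog grows whenever messages arrive faster than that. The sketch below works through the arithmetic. The function name and every number used with it are hypothetical and do not reflect Spruce's actual worker counts, traffic or timeout settings.

```python
def backlog_after(seconds: float, arrival_per_sec: float,
                  workers: int, timeout_sec: float) -> float:
    """Worst-case backlog size when every provider call runs until it times out.

    All inputs are illustrative assumptions, not measured values.
    """
    # Each worker completes one message per timeout period in the worst case.
    drain_per_sec = workers / timeout_sec
    # The backlog only grows if messages arrive faster than they drain.
    growth_per_sec = max(0.0, arrival_per_sec - drain_per_sec)
    return growth_per_sec * seconds


# With 8 workers and 2 messages arriving per second over 10 minutes:
long_timeout = backlog_after(600, arrival_per_sec=2.0, workers=8, timeout_sec=16)
short_timeout = backlog_after(600, arrival_per_sec=2.0, workers=8, timeout_sec=4)
```

In this made-up scenario the 16-second timeout leaves a backlog while the 4-second timeout keeps pace, which is the effect the second action item is aiming for.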
The issue was resolved and the system returned to being fully functional at around 11:53 am PT.
From 10:05am PT to 11:53am PT on October 18, 2022, voicemails, inbound SMS and inbound call events reached providers' Spruce inboxes in a delayed manner. Each delayed event carried an indication in the message itself of how long it had been delayed. There was no impact to inbound calls, outbound calls, secure message exchanges, video calls, email or fax. Spruce identified this issue in response to customer complaints rather than through the proactive monitoring in place for the system in general.
Report: "Spruce-Elation integration is erroring"
Last update
**Summary**

Elation changed the error code returned to indicate expired credentials. Spruce had to update the error code it looks for when deciding to refresh credentials before making an API request. Spruce was notified of this issue through a customer complaint, 10 hours after the issue started. Spruce identified that there had been no code deployment on its end, so it engaged Elation to help investigate.

**Action items to mitigate the impact in the future**

* Spruce to put in a fix to resolve the issue \(done\)
* On Elation's end \(not in our control, so we cannot say whether these will be done, but Spruce has communicated with Elation on them\):
    * Elation to consider reverting the error code change
    * Elation to consider improved error messages if credentials have expired
* Spruce to have a quicker escalation path with Elation \(Elation shared email addresses that Spruce should reach out to, in addition to Slack\)
* Spruce to investigate how it can be proactively notified of elevated integration-related errors
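The failure mode described above, where a client refreshes credentials only on one specific error code, can be made less brittle by treating a configurable set of codes as "credentials may have expired". The sketch below shows the idea. It is not Spruce's integration code: the function name, the tuple standing in for an HTTP response, and the codes 401 and 403 are hypothetical, and the actual codes Elation returns are not stated in this report.

```python
from typing import Callable, Iterable, Tuple

# (status code, body): a stand-in for a real HTTP response object.
Response = Tuple[int, str]


def call_with_refresh(
    send: Callable[[str], Response],
    refresh: Callable[[], str],
    token: str,
    expired_codes: Iterable[int] = (401, 403),  # hypothetical, not Elation's documented codes
) -> Tuple[Response, str]:
    """Send a request; refresh credentials and retry once if they look expired."""
    response = send(token)
    if response[0] in set(expired_codes):
        # Obtain a fresh token and retry exactly once to avoid retry loops.
        token = refresh()
        response = send(token)
    return response, token
```

Keeping the code set in configuration means a partner-side change could be absorbed without a code deployment.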
We deployed a fix at 11:08 am PT Oct 12 2022 and can confirm as of 11:15am PT that the Spruce-Elation integration is now fully functional again.
We have identified the cause to be a deployment by the Elation development team last night (around 9:40pm PT Oct 11 2022) that changed the assumptions for when an API partner (such as Spruce) should re-authenticate credentials for an integration. Consequently, most integrations were failing because Spruce was not refreshing credentials when it should. We are making a change to adapt to the updated assumptions, while the Elation team is evaluating whether to revert to the previous assumptions. We should have a fix rolled out in the next 30 minutes.
We are currently experiencing an issue where the Spruce-Elation integration is erroring. Any patient a provider adds to Elation may not automatically show up in Spruce, if that's what you expect. Any demographic changes made to a patient in Spruce or Elation will not sync to Elation or Spruce. Any attempt to sync messages from Spruce to Elation will not work. We are investigating this issue in partnership with Elation as we speak to root cause it.
Report: "Sent messages reappearing in the composer bar immediately after being sent"
Last updateFrom approximately 1:05 AM to 11:23 AM Pacific time, the Spruce web app would intermittently restore an already-sent message to the compose bar.
Report: "Patients' inbox and messages failing to load on web"
Last updateDuring approximately 11 hours (00:09 to 11:07 PDT), patients were not able to view conversations or messages on the Spruce web app. This impacted most patients who used the web app (the exception being those loading the web app from a browser bookmark). The issue was caused by a server-side change in logic intended to increase performance, which caused unexpected conditions for the patient web app. The result was that patients were not able to see conversations during this period.
Report: "Delayed notifications"
Last update
# Summary

Our cloud infrastructure provider, AWS, reported elevated errors and latency for the Systems Manager Parameter Store service in US-East-1. You can see their report [here](https://www.loom.com/i/ac62abd3a9c5448195304c2f2c65c25b). This AWS incident impacted the ability of AWS Lambda functions to execute, since they could not read parameter values from the Parameter Store. This AWS incident in turn impacted the Spruce platform as described below.

# Context

The Spruce platform leverages AWS Lambdas \(server-less functions\) to process inbound SMS, user-facing app-based push notifications, and badge count updates for smartphones and the web. The benefit of the Lambdas is that they automatically scale up in times of high demand and scale down to maintain a minimum number of serverless functions.

The Spruce platform has redundancy designed in to process inbound SMS and app-based push notifications without needing AWS Lambdas. If AWS Lambdas fail to execute, inbound SMS is received by an application API via a fallback webhook from our telecommunications infrastructure provider \(Twilio\). App-based push notifications are processed by application-level workers that run in the same tasks that service the rest of the platform and listen on the same distributed queues that the AWS Lambdas listen on.

Every application-level task also depends on AWS Parameter Store to pick up configuration values. Each application instance accesses the AWS Parameter Store at startup to pick up the configuration values, which are then used for the lifecycle of the application task until the next deployment or instance replacement.

# Impact

**Good news:** Given the fallback logic in place as described above, inbound SMS and user-facing app-based push notifications continued to function without any impact from the Lambdas failing. All existing application-level tasks continued to operate normally as well, since they do not rely on the AWS Parameter Store for normal functioning and only rely on it at startup time.

**Where the impact was felt:** There is typically a limited number of application-level workers running to service badge-count updates and user-facing app-based push notifications. While there was no impact to the user-facing app-based push notifications, badge-count updates were delayed because they are high throughput \(they also provide real-time updates to the web\) and only a limited number of workers was available to service them. Typically, the Spruce engineering team can easily increase the number of tasks to keep up with badge-count updates. In this case that was not possible, because new tasks could not start up given their reliance on AWS Parameter Store to pick up their configuration values. So the engineering team decided to stay put and tolerate the delayed badge-count updates, knowing that the platform was otherwise operating normally.

The user-facing impact was as follows:

* The unread badge count on the Spruce application on smartphones did not update in real time to reflect the right badge count
* The web app did not update in real time as it typically does, given that the web relies on the badge-count updates to refresh its state.

# **Action Items**

We have one solid action item that will improve the overall platform here, which is to reduce dependence on AWS Parameter Store by having an in-memory cache \(Redis\) as a fallback. Each time an application task starts up, it will write its configuration values to Redis. If a task cannot access AWS Parameter Store, it will then fall back to reading the values from Redis. We are prioritizing this change given the immediate impact it can bring to the system. This change alone will make it so that in the future, if AWS Parameter Store is impacted, we can continue to bring up as many application-level tasks as we’d like, and AWS Lambdas would continue to function normally as well.
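The startup fallback described in the action item can be sketched as follows. This is a minimal sketch under assumptions, not the implementation Spruce built: `load_config`, `fetch_primary` and `ConfigUnavailable` are hypothetical names, and a plain mapping stands in for Redis so the example stays self-contained.

```python
from typing import Callable, Dict, MutableMapping


class ConfigUnavailable(Exception):
    """Raised when neither the primary store nor the cache can supply values."""


def load_config(fetch_primary: Callable[[], Dict[str, str]],
                cache: MutableMapping[str, str]) -> Dict[str, str]:
    """Load configuration at task startup, falling back to a cached copy.

    `fetch_primary` stands in for a Parameter Store read and `cache` stands in
    for Redis; both are assumptions made for illustration.
    """
    try:
        values = fetch_primary()
    except Exception:
        # Primary store unreachable: fall back to the last values written by
        # any task that started successfully. Broad catch is for brevity only.
        if not cache:
            raise ConfigUnavailable("primary store failed and cache is empty")
        return dict(cache)
    # Successful startup: refresh the fallback copy for future tasks.
    cache.update(values)
    return values
```

One trade-off of this design is that cached values can be stale, so a real implementation would likely need to decide how old a cached configuration may be before it is rejected.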
We have started seeing recovery as of 10:34am PT. All badge count updates are processing normally and without delay now. AWS just confirmed (as of 11:27am PT) what we are seeing, that they too are seeing recovery. We will resolve this incident for now since the system is back to operating normally. If you have any questions or concerns please don't hesitate to reach out to us via the Spruce Support conversation in app or support@sprucehealth.com.
We have identified that Spruce app notifications are actually processing normally without a delay. So app notifications and video calling notifications are working just fine. We had misdiagnosed the issue. It is only badge count updates that are delayed at this point. So if you receive a new Page or if you receive a new message, your badge count will not update. But you will see the app notification on your smartphone.
We are continuing to investigate this issue.
Spruce app push notifications are currently delayed due to an issue with our underlying cloud service provider (AWS). We are continuing to monitor the situation and will post an update as soon as we have one to share. Video call notifications may also be impacted as a result. So if you are engaging in video calls with your patients, please ask them to have the Spruce app open while waiting for a video call from you. Inbound/outbound calls, secure message exchanges, email, fax and SMS all continue to operate normally.
Report: "Spruce inbox failing to load"
Last update
**Analysis**

One of the instances of a service reached an unhealthy state. Any requests to the instance errored with a timeout because the instance was unresponsive. It took ~10 minutes for the platform to self-heal by terminating the instance and replacing it with a healthy one. Each client \(user’s web app or smartphone\) sends requests to the API layer, which round-robins requests to the appropriate services for processing. The round-robin logic is currently not advanced enough to filter out unhealthy instances, consequently leading to intermittent failed requests until the instance is replaced.

**Action items**

* Explore improving client-side load-balancing logic or using a service mesh to improve observability and reliability of intra-service communication
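One way to picture the client-side load-balancing improvement is a round-robin selector that skips instances currently marked unhealthy. The sketch below is illustrative only: the class and method names are hypothetical, and how an instance gets marked unhealthy (health checks, repeated timeouts, and so on) is left out.

```python
from typing import List, Set


class HealthAwareRoundRobin:
    """Round-robin over instances, skipping any marked unhealthy (sketch only)."""

    def __init__(self, instances: List[str]):
        self.instances = list(instances)
        self.unhealthy: Set[str] = set()
        self._cursor = 0

    def mark_unhealthy(self, instance: str) -> None:
        self.unhealthy.add(instance)

    def mark_healthy(self, instance: str) -> None:
        self.unhealthy.discard(instance)

    def next_instance(self) -> str:
        # Try each instance at most once per call, starting from the cursor.
        for _ in range(len(self.instances)):
            candidate = self.instances[self._cursor % len(self.instances)]
            self._cursor += 1
            if candidate not in self.unhealthy:
                return candidate
        raise RuntimeError("no healthy instances available")
```

With this approach, an unresponsive instance stops receiving traffic as soon as it is marked, rather than only after the platform replaces it.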
The Spruce Inbox failed to load intermittently for many customers between 11:50am PT and 12:03pm PT. This was due to one of the services reaching an unhealthy state where one of the instances of the service became unresponsive. Any request from the client to that unhealthy instance resulted in a timeout, which the user experienced as a failed load of the inbox or of a particular conversation. Inbound/outbound calling and inbound SMS/Email/Fax/Secure messages were not impacted during this time. Users may have been unable to send a message from the Spruce app if their request hit the unhealthy instance, but a subsequent retry likely resolved the issue.
System automatically resolved itself after identifying the unhealthy task and replacing it with a healthy one
Spruce inbox is intermittently failing to load for users. Our engineering team is on it and investigating.
Report: "Call failures and SMS delayed"
Last updateOur telephony infrastructure provider has reported that all services impacted are now fully operational. Any delayed SMS should now be delivered to inbox. All inbound calls that failed were logged as system failures in the inbox. We apologize for the inconvenience this caused, and will be working with our telephony partner to identify the root cause for this issue.
Video calls are also impacted at this time. Our underlying telephony infrastructure provider is experiencing issues that are resulting in a partial outage on our end for video calls and phone calls.
We are investigating failures with inbound and outbound calls. Inbound SMS is also likely delayed at this time. There is no impact to secure messaging, fax or the Spruce inbox at this time.
Report: "Issues with inbound calls."
Last updateThis incident has been resolved.
The issue seems to have been resolved by the provider.
Outbound calls are functional at this time. So if a call drops when received, you should be able to place an outgoing call to the phone number to successfully connect with the caller.
Our telephony provider is currently experiencing issues that are affecting some inbound calls. They are investigating and we will update as we know more.
Report: "Outbound calls and SMS experiencing issues"
Last updateSystem has remained fully operational for the past hour. Twilio has confirmed that the system recovered when we thought it did. They're monitoring the issue to ensure that their system remains fully resolved. Marking this incident as closed at this point. Incident started at 12:20pm PT and ended at 1:02pm PT.
No longer seeing any errors pertaining to outbound calls and SMS, but we haven't received confirmation from Twilio just yet that things are resolved. Going to keep monitoring for a bit, but going to consider the system operational at this point. Note that no inbound or outbound SMS were missed. If an outbound call didn't go through, you can try again now and it should.
System is showing early signs of recovery. Outbound calls and SMS no longer showing errors. We'll keep monitoring the issue to confirm that it is fully resolved and we have confirmation from Twilio of the same.
Inbound calls and SMS are functional. Outbound calls are intermittently failing. Outbound SMS from the app is delayed. Our underlying cloud telephony provider, Twilio, has acknowledged this to be an issue on their end and is currently investigating. We'll provide an update as soon as we have more to share.
We are investigating an issue where inbound and outbound calls are failing to connect, and outbound SMS is failing to send. Inbound SMS may be impacted as well. We'll keep this incident page posted as we learn more.
Report: "Spruce system availability issues"
Last update**Incident time**: 7:32am PT - ~2pm PT
**Summary**: Amazon Web Services (AWS) us-east-1 region experienced API-related issues that made it so that their console wouldn't load and services were impacted due to impaired network connectivity. This resulted in the following customer-facing issues:
* Inbound calls may have failed to route to appropriate agents
* Inbound SMS, voicemails and call events may have failed to deliver to the inbox. We will be retroactively delivering these to the appropriate inbox.
* Notifications were significantly delayed, which resulted in patients missing incoming video calls
* Due to the delayed notifications, many providers could not connect with patients over video calls
* Autoresponders failed to trigger during the incident
* Workflows that customers have in place likely did not trigger accurately during this time
Notifications are no longer delayed, and the system is operational again. We'll be doing some more investigations internally to see how we can gain more insight and hopefully reduce the likelihood of delayed notifications in the future.
Notifications continue to be delivered with a delay, and sometimes in quick succession if there is a series of notifications pertaining to a particular user. When the user clicks on a notification, they may not find a new message because the notification may be for a message that they have already read. Patients may also be getting delayed notifications indicating an incoming call from a provider. The reason for these delayed notifications appears to be that AWS is delaying the delivery of notifications to Apple and Google servers, which in turn deliver them to the end user on iPhone or Android. While some of this last-mile message delivery is out of our control, we're doing all we can to monitor the situation, provide updates and identify ways in which we can improve the system for the long term.
Still monitoring. The only known system-wide impact at this time is delayed notifications. Note that the delayed notifications may impact the incoming video call notification for patients, leaving patients unaware of a video call from their provider. If you're looking to engage in video calling with your patients, we suggest that the patient has the Spruce app open while waiting for your video call.
Quick update on overall health of system thus far:
- App notifications are delayed.
- Spruce inbox functional for most, may be slow to load from time to time.
- Sending and receiving of secure messages should be functional. This includes internal notes, secure messages between patients and providers, and team conversations.
- Inbound calls are functional, though there continue to be intermittent failures.
- Inbound SMS is functional, though there continue to be intermittent failures.
- Outbound Call and SMS should be functional.
- Fax remains functional.
- Video calling remains functional, though there may be intermittent failures since Twilio is reporting failures.
We will continue to monitor the issue and post updates.
In our testing, inbound calls are routing to the appropriate agents at this time. It's still possible that there are intermittent issues for some given that Twilio continues to report issues with call routing.
Inbound SMS is being accepted by our system now, though there may still be intermittent failures in SMS being accepted.
Given the underlying AWS issue, the system is impacted in the following ways:
- Spruce inbox may be slow to load.
- Inbound calls are not being directed to the appropriate agents.
- SMS is not being accepted by our system. Our underlying telecom provider, Twilio, is also reporting issues, given that it relies on AWS.
- Outbound calls and SMS seem functional.
- Posting messages into conversations within Spruce, both for patients and providers, seems to be functional.
The issue appears to be with our underlying cloud service provider, AWS, which is experiencing an outage. While the AWS status page does not report an issue, several people on the internet are reporting similar issues with their AWS environments, as you can see here: https://downdetector.com/status/aws-amazon-web-services/
We are currently investigating error rates for inbound calls and SMS, and the Spruce inbox potentially being slow to load. The issue started around 7:32am PT.