Historical record of incidents for Medable
Report: "SMS Notifications not functioning for US numbers."
Last update: We are currently investigating an issue where SMS notifications are not being delivered to US phone numbers.
Report: "Email notifications not functioning"
Last update: Our email delivery provider stopped delivering emails for 4 hours due to an unforeseen account-related issue. This has been resolved.
Report: "Outage on US Prod"
Last update: On 11/8/2022, from 1:18 am PST to 2:28 am PST, the Web Application proxy service went down, rendering all web applications unavailable, including Study Manager and Patient App Web. Upon investigation, action was taken to restore the service, and all applications were available shortly thereafter.
Report: "Outage on US Prod"
Last update: We had a brief (5 min) outage on our US prod environment due to database maintenance. The maintenance was not expected to cause any impact, but unforeseen factors required us to redeploy the API services after we finished the DB maintenance activity. We do not expect this to be repeated.
Report: "API Outage on our Development environment"
Last update: This incident has been resolved.
All services are operational.
We are currently investigating an issue affecting our Development environment in North America. More updates to follow shortly.
Report: "API Outage"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "API Outage"
Last update: This incident has been resolved. A review of today's outages is underway, and updates and an RCA will be provided soon.
We are currently investigating this issue.
Report: "API Outage"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "API Outage"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "File access and uploading unavailable in dev environment"
Last update: From 19:00 UTC to 19:20 UTC today, access to files on the Medable platform in the US development environment was unavailable. This impacted any user attempting to upload or download files via the Cortex API, and it also impacted web applications served from this environment. Medable identified this issue immediately and worked with our infrastructure providers to restore functionality. We apologize for the disruption this may have caused our customers in their development environments, and we are working with our infrastructure providers to ensure this issue cannot happen again.
This incident has been resolved.
We are currently investigating this issue.
Report: "API Outage"
Last update: We experienced an API outage in our production environment on 13Apr2020. Our teams identified this at 18:21 UTC and worked to resolve the issue. This was a Severity 0 - Critical issue (according to the Medable Service Level Agreement) and was resolved by 19:18 UTC. We are currently completing our Incident Management Documentation and a full root cause analysis (RCA). Once the RCA is complete, we will update this space with the details. We apologize for the inconvenience and appreciate your patience while we resolved the issue.
Platform availability was restored at 19:18 UTC. We've been monitoring the fix and will continue to monitor.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Performance issue (TLS handshake on API endpoints)"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating an issue where some endpoints are having difficulty completing the TLS handshake.
Report: "Dev API Outage"
Last update:
#### Cause
A routine update in our deployment layer resulted in network connectivity issues between nodes. This caused services that were functioning normally to appear down to the load balancing and proxy services, which ultimately made the API inaccessible externally.
#### Resolution
Upon identifying the cause, we worked to re-establish connectivity internally, restoring load balancing and proxying services.
#### Prevention
New maintenance processes have been put in place for the nodes in question. Preventive maintenance will take place on non-production nodes first, allowing the updates to be tested and verified before they can impact production services. In production, these updates will then be applied during scheduled maintenance so that the impacts can be closely monitored.
Dev API restored. We will update with details on the cause shortly.
The development API is experiencing an outage. We are investigating now and will update with more details as we have them.
We are currently investigating this issue.
Report: "Production API outage"
Last update:
#### Cause
A routine update in our deployment layer resulted in network connectivity issues between nodes. This caused services that were functioning normally to appear down to the load balancing and proxy services, which ultimately made the API inaccessible externally.
#### Resolution
Upon identifying the cause, we worked to re-establish connectivity internally, restoring load balancing and proxying services.
#### Prevention
New maintenance processes have been put in place for the nodes in question. Preventive maintenance will take place on non-production nodes first, allowing the updates to be tested and verified before they can impact production services. In production, these updates will then be applied during scheduled maintenance so that the impacts can be closely monitored.
The outage has been resolved. All production services are operational. We will report back with an analysis of the situation shortly.
Medable is experiencing an outage in the production API environment. The issue is currently under investigation.
Report: "Cortex Web App Connectivity Issues (Dev and Prod)"
Last update: The issue has been resolved. We are continuing to monitor our infrastructure provider's response to the issue and will provide updates here.
We are monitoring our infrastructure provider's response to the issue and will update accordingly.
The issue has been identified as a networking issue with our infrastructure provider's load balancers that is leading to 502 errors being returned to clients.
We are continuing to investigate this issue.
We are currently investigating an issue impacting the Cortex web apps in prod and dev. The Cortex API remains available. Customer end-user applications are not impacted.
Report: "Outage"
Last update: All services have been restored. We are working with our infrastructure partners to identify the root cause and steps to prevent the issue in the future.
We are investigating an outage across dev and production. We will report back with more details as we have them.
Report: "Outage across all services"
Last update: All services restored. The cause was a brief routing issue that was quickly identified by monitoring agents and mitigated. The affected timeframe was approximately 5 minutes. We have worked with our partners to add safety checks to prevent this issue from occurring again.
We are experiencing an outage across all services. We are investigating the cause and will report back with more details.
Report: "Performance degradation in development environment"
Last update: Performance in the dev environments has been restored to normal. We will continue to monitor dev closely and will report back with details on the resolution.
We continue to investigate the timeout issue and are exploring a number of paths. We are very sorry for the interruption this has caused in development efforts for many organizations. We will continue to work as quickly as possible to resolve this issue and resume normal operations in the development environments.
We continue to investigate the timeout issues in api.dev. As we discover more details we will present them here.
We are investigating an issue in the development environment that is leading to timeout errors through the API.
Report: "Database performance degradation"
Last update: The DB replica causing the degraded performance has been replaced successfully. Affected orgs should see query performance return to normal.
We've identified a performance issue in a database cluster member. We are working to resolve the issue as quickly as possible and will report back with more details.