Historical record of incidents for Respondus
Report: "Respondus Monitor LiveChat Service Notice"
Last update: The Respondus Monitor LiveChat service is experiencing an outage. This is not affecting the Respondus Monitor application itself, only the LiveChat help system built into it. The underlying LiveChat technology is from a third party. Status and updates on this issue can be followed here: https://status.livechat.com/.
Report: "StudyMate Campus outage"
Last update: The service is functioning normally.
The service has resumed and we are monitoring the system.
We are currently investigating an outage of StudyMate Campus.
Report: "StudyMate Campus outage"
Last update: We are currently investigating an outage of StudyMate Campus.
Report: "Respondus Monitor Service Notice"
Last update: Root Cause Analysis for Service Slowdown and Interruption, March 5, 2025
We sincerely apologize for the service issues on March 5th that affected some of our applications. We understand how much your institution relies on our services, and we regret any inconvenience this may have caused. Below, we'd like to provide details on what happened and the steps we are taking to prevent future occurrences.
At approximately 10:00am PST on March 5, 2025, we started receiving alarms indicating a rapid increase in the failure rate of students starting and completing the pre-exam steps in Respondus Monitor. However, everything appeared normal in the application servers in terms of processing requests, server load, memory usage, and thread and connection counts. Requests were serviced without errors and within our maximum target of 250ms response time. But while all health checks within the AWS environment indicated the servers were healthy and operating normally, we began receiving alerts from monitoring services that run external to AWS showing elevated response times and intermittent failures.
We then focused on the AWS load balancer service (ALB), which distributes incoming traffic to the application servers. This service is fully managed by AWS, so we have limited insight into the health of the appliance and cannot restart its nodes. After examining the access logs for the load balancer, we saw a very high number of Error 460 entries, which according to the AWS documentation means: "Client errors are generated when requests are malformed or incomplete. These requests were not received by the target, other than in the case where the load balancer returns an HTTP 460 error code. This count does not include any response codes generated by the targets."
Our initial thought was that a DDoS attack was flooding the load balancer with malformed packets. But this was quickly ruled out because the requests associated with the errors looked normal in the access logs. Additionally, our Global Accelerator endpoints are protected by AWS Shield, which should stop network layer attacks before they reach the load balancer. The 460 errors did, however, explain why the application servers appeared to be operating normally: a large percentage of requests were not reaching them.
Engineers at AWS said the 460 errors would also occur if the client (LockDown Browser) closes the connection before the load balancer sends the response. This seemed implausible because there hadn't been recent updates to the client applications. Moreover, such an issue would emerge gradually, over days or weeks, based on how we introduce new releases. This event escalated in minutes.
Concurrently with our investigation, we performed two rolling restarts of the application servers, which didn't result in much improvement. Given these symptoms, we suspected the issue might be with the load balancer service itself. We decided to terminate all application servers at once (to fully scale down the load balancer) and then launch new application servers (to scale up the load balancer, but on different nodes). This had the immediate effect of restoring the service, and our initial theory was that the load balancer had gotten into a bad state.
After further investigation, however, we determined the problem wasn't with the load balancer. The root cause was that a bandwidth limit had been reached on the elastic network interfaces attached to the application server instances, resulting in a bottleneck that continued to grow in the early stages as failed requests were retried.
This service event primarily entailed slowdowns in request responses and intermittent failures. Once students entered an exam, this event would not have affected them until after the exam was submitted on the learning system. At that point, students may have experienced delays or failures as they attempted to exit the Respondus Monitor system. The only time there was a complete outage was when everything was shut down for a few minutes to restart the entire service.
We have since performed a detailed analysis of the event and have configured new alarms that trigger auto-scaling of application servers before the network bandwidth limit is reached. We have also increased the minimum number of application servers, which will smooth the opening minutes of a massive autoscaling event.
Finally, we also want to note that the StudyMate Campus service was similarly affected during this event, as were users trying to start exams using the Chromebook version of LockDown Browser. In the latter case, the impact was due to how the Chromebook extension retrieves settings at startup.
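As an illustration of the remediation described above (scaling out before a per-instance network bandwidth limit is reached), the following is a minimal sketch only. It is not the actual alarm configuration used for Respondus Monitor. The names `NetworkSample`, `utilization`, and `desired_capacity`, along with the bandwidth limit, target utilization, and minimum instance count, are hypothetical values chosen for the example, and the sketch makes no calls to AWS.

```python
import math
from dataclasses import dataclass


@dataclass
class NetworkSample:
    """Bytes sent plus received by one instance over one sampling period (hypothetical type)."""
    instance_id: str
    bytes_transferred: int
    period_seconds: int


def utilization(sample: NetworkSample, limit_gbps: float) -> float:
    """Return the fraction of the assumed per-instance bandwidth limit in use."""
    bits_per_second = sample.bytes_transferred * 8 / sample.period_seconds
    return bits_per_second / (limit_gbps * 1e9)


def desired_capacity(samples, limit_gbps, target=0.6, min_instances=4):
    """Return the instance count that would bring average utilization down to `target`.

    `target` is deliberately below 1.0 so that scale-out starts well before
    the limit is hit; `min_instances` is a floor that is never undercut.
    """
    if not samples:
        return min_instances
    total = sum(utilization(s, limit_gbps) for s in samples)
    return max(min_instances, math.ceil(total / target))


if __name__ == "__main__":
    # Four instances, each moving 5.625 GB in 60 s against an assumed 1 Gbps limit.
    fleet = [NetworkSample(f"i-{n}", 5_625_000_000, 60) for n in range(4)]
    print(desired_capacity(fleet, limit_gbps=1.0, target=0.5, min_instances=2))
```

Keeping the target utilization well under the limit leaves headroom for retry traffic, which the analysis above identifies as the reason the bottleneck kept growing in its early stages.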
This incident has been resolved.
All systems are operational and working properly. We will provide details at a later time about the investigation.
We are continuing to investigate this issue.
We are seeing improvements across the system. We will continue to monitor it closely and will provide details later about our investigation.
This issue is ongoing and we continue to work with AWS on a resolution. Respondus Monitor users are experiencing very slow response times during the pre-exam steps. Chromebook users are experiencing an outage.
We continue to work with AWS to troubleshoot this issue. Service continues to be degraded for Respondus Monitor. The same issue is impacting Chromebook users of LockDown Browser.
We continue to experience problems with users being able to start new Respondus Monitor sessions. We will provide another update shortly.
The Respondus Monitor service is currently experiencing a slowdown. We are investigating the source of the problem.
Report: "Respondus Monitor service notice"
Last update: The service is now operating normally. The earlier slowdown is being investigated.
We are continuing to investigate this issue.
We are investigating an issue where some Respondus Monitor users are unable to complete the steps that occur prior to the start of an exam.
Report: "Respondus Monitor Service Notice"
Last update: The source of the problem has been identified and fixed. We do not anticipate further issues relating to it.
On October 16, 17, and 18, there was one incident each day that disrupted the Respondus Monitor service for approximately 2 minutes. Application servers failed a health check and were automatically replaced by new ones to resolve the issue. We are investigating the root cause of the health check failure. During these brief disruptions, test takers who were underway with an exam were unaffected. The disruptions primarily affected students in the pre-exam and post-exam stages, who were shown a "gateway error" message. We will update this notice once the problem has been resolved.
Report: "StudyMate Campus outage"
Last update: The StudyMate Campus service has been restored. We are investigating the cause of the brief outage.
We are continuing to investigate this issue.
The StudyMate Campus service is currently experiencing an outage that we are investigating. This is not related to LockDown Browser or Respondus Monitor.
Report: "Respondus Monitor LiveChat Service Notice"
Last update: Resolved: The Respondus Monitor LiveChat service is now functioning normally.
The Respondus Monitor LiveChat service is experiencing an outage. This is not affecting the Respondus Monitor application itself, only the LiveChat help system built into it. The underlying LiveChat technology is from a third party; status and updates on this issue can be followed here: https://status.livechat.com/.
Report: "Respondus Monitor LiveChat Service Notice"
Last update: The Respondus Monitor LiveChat service is now functioning normally.
The Respondus Monitor LiveChat service is experiencing an outage. This is not affecting the Respondus Monitor application itself, only the LiveChat help system built into it. The underlying LiveChat technology is from a third party; status and updates on this issue can be followed here: https://status.livechat.com/.
Report: "Respondus Monitor LiveChat Service Notice"
Last update: The Respondus Monitor LiveChat service is now functioning normally.
The Respondus Monitor LiveChat service is experiencing intermittent issues. This is not affecting the Respondus Monitor application itself, only the LiveChat help system built into it. The underlying LiveChat technology is from a third party; status and updates on this issue can be followed here: https://status.livechat.com/.
Report: "Respondus Monitor service notice"
Last update: Amazon identified a problem with the Amazon Cognito Identity Pools in the US-EAST-1 Region. The AWS issue appears to be resolved and the service is returning to normal.
We are currently investigating an issue with users being unable to start a Respondus Monitor session.
Report: "Respondus Monitor service notice"
Last update: The issues reported by AWS have been resolved. There was no impact to LockDown Browser or Respondus Monitor users during this time.
On Tuesday, June 13, 2023, at approximately 12:00 pm PDT (19:00 GMT Tuesday), AWS began reporting degradation across several services, including access to the AWS Management Console. We don't currently have reports of issues with LockDown Browser or Respondus Monitor, but we are watching the situation carefully.
Report: "Respondus Monitor service notice"
Last update: The problem affecting Respondus Monitor is resolved. We continue to investigate the source of the problem, which affected the start of new Respondus Monitor sessions for approximately 10 minutes.
The problem appears to have been resolved and service is returning to normal.
The Respondus Monitor service is currently experiencing some issues that we are investigating.
Report: "Respondus Monitor service notice"
Last update: The AWS problems have been resolved and operations are back to normal.
There is a widespread AWS outage that is currently affecting Respondus Monitor (see https://health.aws.amazon.com/health/status). We are monitoring the situation closely.
We are currently investigating an issue with users being unable to start a Respondus Monitor session.
Report: "Blackboard Ultra customers cannot use LockDown Browser or Respondus Monitor"
Last update: Blackboard has provided a fix for all affected Blackboard Ultra production instances and will be updating customer test/staging servers on Friday 4/22.
The issue has been identified and Blackboard is rolling out a fix to the affected sites.
We are currently investigating an issue with the latest update of Blackboard Ultra.