BenchPrep

Is BenchPrep Down Right Now? Check whether an outage is currently ongoing.

BenchPrep is currently Operational

Last checked from BenchPrep's official status page
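
For readers who want to script this check: below is a minimal Python sketch, assuming BenchPrep's status page is hosted on Atlassian Statuspage (whose pages expose a standard JSON summary endpoint). The status.benchprep.com hostname is an assumption for illustration, not something confirmed by this page.

```python
import requests  # third-party: pip install requests

# Assumption: BenchPrep's status page runs on Atlassian Statuspage,
# which exposes a JSON summary at /api/v2/status.json. The hostname
# below is a guess for illustration.
STATUS_URL = "https://status.benchprep.com/api/v2/status.json"

def check_benchprep_status():
    """Return the overall status, e.g. 'All Systems Operational'."""
    resp = requests.get(STATUS_URL, timeout=10)
    resp.raise_for_status()
    status = resp.json()["status"]
    # Statuspage indicators: none, minor, major, or critical.
    return f"{status['description']} (indicator: {status['indicator']})"

if __name__ == "__main__":
    print(check_benchprep_status())
```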

Historical record of incidents for BenchPrep

Report: "Degraded Perfromance"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented; we have confirmed the applications are operational and we are monitoring current performance.

investigating

We're continuing to investigate the issue and are currently testing a potential change to improve connectivity.

investigating

We are actively investigating an issue with degraded performance across our applications.

Report: "Degraded Perfromance"

Last update
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented, we have confirmed the applications are operational and we are monitoring current performance.

Update

We're continuing to investigate the issue and are currently testing a potential change to improve connectivity.

Investigating

We are actively investigating an issue with degraded performance across our applications.

Report: "Release Maintenance"

Last update
Completed

The scheduled maintenance has been completed.

Update

We are continuing to verify the maintenance items.

Verifying

Verification is currently underway for the maintenance items.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We are planning a short maintenance window. While we expect the performance impact to be minimal for most of our customers, some customers may experience degraded performance or downtime for up to 15 minutes.

Report: "Degraded Performance - Boost Dashboard and Institution Admin"

Last update
resolved

This issue has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are actively investigating an issue causing the Boost Dashboard and Institution Admin to display incorrectly. The Learning Application is not affected.

Report: "Downgraded Performance - Boost Dashboard"

Last update
resolved

This incident has been resolved.

monitoring

The dashboards have been successfully updated, and we are actively monitoring the ongoing update.

investigating

We are continuing to monitor the ongoing update and are investigating the root cause of the issue.

investigating

We are investigating the recurrence of this issue while closely monitoring an ongoing update. We will share further updates.

monitoring

The dashboards have been successfully updated, and we are actively monitoring the ongoing update.

investigating

We are actively investigating this issue while closely monitoring an update that is currently in progress.

investigating

We are currently investigating an issue with our Boost Dashboard application. The issue does not impact any learner facing applications or other admin applications.

Report: "Downgraded Performance - Reporting Dashboards in Console"

Last update
resolved

This incident has been resolved.

monitoring

The vendor issue was resolved at 19:50 CT. We are monitoring and working with the vendor to get an explanation of the issue and its resolution.

investigating

We are currently investigating an issue with Console Analytics dashboards not displaying any data. Our team is actively working with a third-party vendor to resolve. This issue is isolated to the Console Analytics section and does not impact the Learning Application.

Report: "Downgraded Performance - Boost Dashboard"

Last update
resolved

This incident has been resolved.

monitoring

We have confirmed the Boost application was successfully updated, and we are monitoring the results.

investigating

We are currently investigating an issue with our Boost Dashboard application. The issue does not impact any learner facing applications or other admin applications.

Report: "Degraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an intermittent degraded performance of the Learning Application and Admin Tools due to issues within one of our data centers. We are replacing the impacted node and expect services to be fully restored shortly.

Report: "Console - embedded dashboard reporting"

Last update
resolved

This issue has been resolved.

investigating

You might experience a slight disruption in Console's embedded dashboard reporting. We will provide an update on the issue. This does not affect learner applications.

Report: "Downgraded Performance - Learning Application - Interactions"

Last update
resolved

A fix has been implemented and the issue has been resolved.

identified

The issue has been identified and we are actively working on a solution.

investigating

We are currently investigating an issue with Interactions not launching successfully within the Learning Application.

Report: "Downgraded Performance - Boost Dashboard"

Last update
resolved

This incident has been resolved.

monitoring

We've applied a solution and are monitoring the ongoing progress; the completion of the process is expected to take a few hours. We will provide further updates.

identified

We are actively implementing a solution to fix the issue. This process will take time, and we will provide further status updates.

investigating

We are continuing our investigation and are actively working on a solution to resolve the issue. We will provide further updates.

investigating

We are currently investigating an issue with our Boost Dashboard application. The issue does not impact any learner facing applications or other admin applications.

Report: "Downgraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor the fix; load times within the learning and admin applications have returned to normal.

monitoring

We have identified an issue with downgraded database performance causing intermittent slowness in the learning application and admin applications. A fix has been identified and implemented, and we are monitoring progress.

Report: "Degraded Performance - Console"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue with our Console Admin application which is affecting a subset of admin users. This does not impact the learner facing applications.

Report: "Degraded Performance - Console."

Last update
resolved

This issue has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue with our Console Admin application which is affecting a subset of admin users. This does not impact the learner facing applications.

Report: "Degraded Performance - Console"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The root cause has been identified and we are working on deploying a fix.

investigating

We are currently investigating an issue with our Console Admin application which is affecting a subset of admin users. This does not impact the learner facing applications.

Report: "Downgraded Performance - Boost Dashboard"

Last update
resolved

This issue has been resolved as of last night at 21:00:24 CDT.

monitoring

The root cause has been identified, we have implemented a fix and are monitoring progress.

investigating

We are currently investigating an issue with our Boost Dashboard application. The issue does not impact any learner facing applications or other admin applications.

Report: "Downgraded Performance - Reporting Dashboards in Console"

Last update
resolved

The issue has been resolved and all of the Dashboards accessible via Console are performing as expected.

identified

We are continuing to work on a resolution with our vendor. We will provide further status updates.

identified

The root cause has been identified and we are working on a resolution with one of our vendors. We will provide further status updates.

investigating

We are currently investigating an issue with data displayed in the User, Branch/Group, and Branch Summary dashboards accessible via Console. Until the issue is identified and resolved, the dashboards will not display any information. This does not impact the learner-facing applications, and Boost Dashboards remain available.

Report: "BenchPrep Admin Support Center - Intermittent issues"

Last update
resolved

The issue has been resolved. Our Ticketing System is fully functional.

investigating

We are currently experiencing intermittent issues with our Ticketing System & Knowledge Base (https://support.benchprep.com/home/). You can submit tickets directly via email: help@benchprep.com. This does NOT affect any Administration Tools or end-user Learning Applications.

Report: "Snowflake - Degraded Performance"

Last update
resolved

This incident has been resolved. Data has been fully resynced and is performing regular scheduled syncs as of 18:24:32 CST.

investigating

We are continuing to monitor the resynchronization process and are seeing progress. We will continue to provide further updates.

investigating

We have started a process of resynchronization of the data and we are seeing signs of progress. We will continue monitoring the process and will provide further updates.

investigating

We are continuing to investigate this issue with our 3rd party vendor. Additionally, we are actively seeking alternate means of refreshing the data. We will provide further updates.

investigating

We have confirmed data has not successfully synced since 2023-01-12 19:42 CST. We are implementing a solution to replicate the data. This process will take time, and we will provide further status updates.

investigating

We are currently investigating an issue with Snowflake data replication with our 3rd party vendor. This does not affect any administration or end user applications and the impact is isolated to Snowflake raw data access.

Report: "Degraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

We have disabled BDR database extensions and have restored connectivity. We will continue to monitor for the time being.

investigating

We are currently investigating an issue with our backend database system and have put the site into maintenance mode for the time being.

Report: "Snowflake - Degraded Performance"

Last update
resolved

This incident has been resolved. Data has been fully resynced and is performing regular scheduled syncs as of 5:42 am CST.

monitoring

The resynchronization process is continuing and we are monitoring progress. The majority of the data has been resynced.

monitoring

A fix has been implemented. We are monitoring the results and will provide further updates.

identified

The issue has been identified and we are working on implementing a solution to replicate the data. This process will take time and we will provide further status updates.

investigating

We are currently investigating an issue with Snowflake data replication with our 3rd party vendor. Data has not been successfully synced since 22:41 CST. This does not affect any administration or end user applications and the impact is isolated to Snowflake raw data access.

Report: "Degraded Performance"

Last update
resolved

The incident has been resolved; we will continue to monitor.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating reports of degraded performance.

Report: "Degraded Performance"

Last update
resolved

We found a regression in an API reporting request that was causing a significant amount of memory consumption, impacting the nodes and the entire system. The fix has been deployed and verified.

monitoring

We have confirmed performance has stabilized, and we are continuing to monitor the affected applications. We are planning to introduce significant improvements to our progress cluster during the next database maintenance.

investigating

The rollback was completed and services have been restored. We are continuing to investigate this issue.

investigating

The restart did not alleviate the issues. We are restoring the services and rolling back the changes.

investigating

We are conducting a short restart of services which can cause non-learner applications to be unavailable for a few minutes.

investigating

We are experiencing an issue with high load times within our non-learner BenchPrep applications. Learner applications are not affected. We are currently provisioning more resources and will continue to investigate.

Report: "Elevated API Errors"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor and to verify that all our data centers are healthy.

monitoring

We are continuing to monitor for any further issues. The mitigation steps have been applied and the system is operational.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are working on deploying a fix shortly to mitigate the issue.

identified

We are continuing to work on a fix for this issue.

identified

We will be performing emergency maintenance.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "Degraded Performance"

Last update
resolved

We are not seeing Ingress pod restarts after increasing pod count in production.

monitoring

We increased the number of pods on Ingress to support the current traffic.

investigating

We are currently investigating degraded performance issues with our ingress pods.

Report: "Degraded Performance with Redis Cluster"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

The upgraded Redis cluster has been rolled out; we will now monitor the situation.

identified

We are rolling out upgrades to our Redis cluster and will put the site in maintenance for an estimated 5-10 minutes.

identified

We are having issues with our Redis cluster; we have identified the problem and are rolling out a temporary solution.

Report: "Degraded Performance - login"

Last update
resolved

The Redis fix has been applied and the issue is now resolved. We will continue to monitor the application.

monitoring

We were able to successfully fail over the Redis nodes, and everything looks good. Traffic is now being served from both data centers, and we are monitoring the applications. We will keep monitoring and will post a final status update.

identified

Redis replication restores have completed, and we are now failing over the Redis services to a different node.

identified

We have been running on Dallas for the past few hours without any issues and are currently performing a full Redis replica restoration.

identified

We restarted our Redis instance and moved traffic to the Dallas data center. We are still working on a fix.

identified

The issue is identified and we are in the process of fixing it.

investigating

We are currently investigating login issues. We will keep you posted with updates.

Report: "Localization issue on Login pages"

Last update
resolved

We verified that the issue was caused by incorrect locales applied by the script.

monitoring

A fix for the locale crash has been implemented and services should be back; we are still looking into the underlying localization issue for select tenants.

investigating

There is an issue with our localization impacting some tenants' login pages; we are still investigating.

investigating

We are currently investigating degraded performance of the login pages.

Report: "Degraded Performance"

Last update
resolved

Nothing critical was identified, and the system seems normal.

monitoring

We are still monitoring and investigating all possible causes of this issue.

monitoring

Restarted Redis nodes in the production cluster. Currently monitoring the system.

investigating

We are currently investigating the issue.

Report: "SSO service is down"

Last update
resolved

The login and SSO services, once brought back up, have been stable.

monitoring

The scaled resources came back up. We are monitoring and will actively work to identify the cause.

identified

We are continuing to work on a fix for this issue.

identified

We are scaling the resources up to ensure the new images are brought up.

investigating

We are continuing to investigate this issue.

investigating

Following a rolling production deployment, the SSO service failed to come back up. We are actively investigating.

Report: "Degraded Performance in Dallas Cluster"

Last update
resolved

We are moving this issue to resolved; close monitoring didn't reveal anything out of the ordinary.

monitoring

We observed network performance degradation and cleared out old connections, which resolved the issue. We will be monitoring closely for the next hour.

investigating

We are continuing to investigate the issue; in the meantime, all traffic has been successfully redirected to the healthy datacenter, restoring normal operations.

investigating

We are investigating the issue impacting our Dallas cluster and are redirecting traffic to the healthy datacenter in the meantime.

Report: "Degraded Performance"

Last update
resolved

Attaching IBM incident https://cloud.ibm.com/status?item=INC4252245

monitoring

We are continuing to monitor for any further issues.

monitoring

We have confirmed packet loss and networking issues with the cloud provider. Services are back, but we will keep the incident open until we hear official confirmation that the issue is resolved.

identified

We are seeing network issues that we are actively working on with the cloud provider. In the meantime, one datacenter is back; we have routed traffic to it and are switching the status back to degraded performance.

investigating

Updating the status to outage. Teams are looking into the issue.

investigating

We are investigating degraded performance reported by a number of our application pods.

Report: "Issue with Course Building"

Last update
resolved

The issue with course builds involving updates to questions has been identified and fixed. You may generate builds for courses with updates to questions. Additionally, our tech team is prioritizing work to optimize the course build process to ensure a consistent course build experience. We encourage you to monitor course build progress and contact support if you experience repeated failed builds.

investigating

If you have made any updates to questions, please refrain from building the course until a further update can be provided. If your changes do not involve updates to questions, you can generate the build.

investigating

BenchPrep is investigating issues related to course builds in BluePrint and advises all customers to hold off on building any courses until an update can be provided.

Report: "Degraded Performance"

Last update
resolved

We have restored traffic and have removed the need to connect to the IBM registry when pods restart.

identified

We have identified an issue with connecting to the IBM cloud registry from our San Jose Datacenter. We are routing all traffic to Dallas.

investigating

We are currently investigating reports of degraded performance.

Report: "Exam Results and BluePrint application"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

An issue was identified in how exam results are being displayed to users. This is originating from the BluePrint application, which will be turned off pending resolution. A fix has been identified and is being worked on. Expected resolution time is 1 hour.

Report: "Degraded Performance"

Last update
resolved

Several shared database connections were corrupted. Restarting our connection pooling software and connected services cleared out the connections. We will continue to monitor and provide updates if the issue recurs.

monitoring

We have identified the problem and are currently monitoring.

investigating

We are currently investigating reports of degraded performance.

Report: "Degraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

We brought the datacenters back up and identified an issue with database contention. We have cleared the contention and are monitoring the situation.

investigating

We are currently seeing an issue with degraded performance impacting our San Jose cluster. We are redirecting traffic to another datacenter and investigating.

Report: "Degraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

System performance has stabilized and we are continuing to monitor network connectivity.

investigating

We are working with our hosting service to investigate Dallas network issues. We will continue to monitor system performance while sending traffic to San Jose.

investigating

We are currently routing all traffic to San Jose while we continue to investigate potential network connectivity issues in Dallas.

investigating

We are experiencing slower than normal load times. We are currently investigating the issue and will post any relevant updates here.

Report: "Lagging reports in Boost"

Last update
resolved

The reports have been generated. We will continue to monitor.

monitoring

We have restarted the report build process and will be monitoring until completion.

identified

We have identified an issue with one step of the report generation process and are deploying a workaround.

investigating

We are currently investigating a delay in report generation in Boost.

Report: "Degraded Performance"

Last update
resolved

Connectivity to both data centers has been restored and we are sending traffic to both locations now. We will continue to monitor the situation.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have identified an issue with our database connection pooling software. Restarting it has temporarily resolved the issue; we are monitoring and looking for a permanent solution.

investigating

We are currently investigating an issue related to memory contention on one of our databases.

monitoring

We are currently routing all traffic to San Jose and are working with our infrastructure provider to address some network connectivity issues. All services are operational.

monitoring

We have identified an issue in our Dallas datacenter and have routed all traffic to San Jose for the time being. We will continue to investigate.

investigating

We are experiencing slower than normal load times and reports of pages unable to load. We are currently investigating the issue and will post any relevant updates here.

Report: "Degraded Performance"

Last update
resolved

We have confirmed that there are no performance issues outside of some internal tools. We will be making some operational changes to address the performance regression and will continue to monitor the situation until it has been resolved.

monitoring

We experienced an issue with database deadlocks at approximately 11 AM CST. That issue has been resolved, but we have received reports of lingering slowness in Blueprint. We are continuing to monitor the performance of the platform.

investigating

We are experiencing slower than normal load times. We are currently investigating the issue and will post any relevant updates here.

Report: "Elevated API Errors"

Last update
postmortem

**Date:** December 11, 2020
**Date of Incident:** December 10, 2020
**Raised by:** Internal Monitoring
**Severity Level:** Critical

**Description**
BenchPrep internal monitoring tools (Pingdom / NewRelic / Airbrake) alerted of site stability issues.

**Root Cause**
DOS (Denial of Service)-like behavior was detected from a small range of IPs.

**Resolution**
BenchPrep engineers blocked traffic from the offending IP addresses.

**Mitigation Strategies**
BenchPrep will be implementing automated DOS mitigation measures so that manual intervention will not have to occur.

**Timeline**
2020-12-10 12:13 PM CST - First alerts triggered
2020-12-10 12:14 PM CST - BenchPrep begins investigating
2020-12-10 12:29 PM CST - BenchPrep blocks offending IP addresses
2020-12-10 12:30 PM CST - Site stability restored

Total resolution time: 17 minutes
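
The mitigation strategy above, automating the IP blocking so manual intervention isn't needed, amounts to per-source-IP rate limiting. Here is a minimal sketch of that idea in Python; the window size, threshold, and iptables output are illustrative assumptions, not BenchPrep's actual implementation.

```python
import time
from collections import defaultdict, deque

# Illustrative policy: more than 300 requests from a single IP within
# 60 seconds is treated as DOS-like behavior. Both numbers are assumptions.
WINDOW_SECONDS = 60
MAX_REQUESTS = 300

recent_hits = defaultdict(deque)  # ip -> deque of request timestamps
blocked = set()

def record_request(ip, now=None):
    """Record one request; return True if this IP just got blocked."""
    now = time.time() if now is None else now
    hits = recent_hits[ip]
    hits.append(now)
    # Drop timestamps that have fallen out of the sliding window.
    while hits and hits[0] < now - WINDOW_SECONDS:
        hits.popleft()
    if len(hits) > MAX_REQUESTS and ip not in blocked:
        blocked.add(ip)
        # A real deployment would push a firewall/WAF rule; here we just
        # print the equivalent iptables command.
        print(f"iptables -A INPUT -s {ip} -j DROP")
        return True
    return False

# Simulated burst from one address trips the block:
for _ in range(301):
    record_request("203.0.113.7")
```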

resolved

This incident has been resolved.

monitoring

Services are returning to normal; we will continue monitoring.

investigating

We are continuing to investigate this issue.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "Elevated Errors Loading Ascend"

Last update
postmortem

**Date:** December 4, 2020
**Date of Incident:** December 3, 2020
**Raised by:** Internal Monitoring
**Severity Level:** High

**Description**
BenchPrep internal monitoring tools (Airbrake) alerted of network connectivity issues with IBM Cloud Object Storage.

**Root Cause**
An expired SSL certificate was deployed at IBM Cloud Object Storage, which caused requests for cached content to fail.

**Resolution**
The issue was resolved before BenchPrep could deploy any temporary workarounds.

**Mitigation Strategies**
BenchPrep will build support for content to be pulled directly from our database in the event of connectivity issues in the future.

**Timeline**
2020-12-03 03:41 PM CST - First exception error reported
2020-12-03 03:51 PM CST - BenchPrep begins investigating
2020-12-03 04:12 PM CST - Connectivity restored

Total resolution time: 31 minutes
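
The mitigation strategy, pulling content directly from the database when object storage is unreachable, is a simple fallback pattern. A minimal, self-contained sketch follows; both storage helpers are hypothetical stand-ins, not BenchPrep's code.

```python
# Hypothetical stand-ins for the real storage clients; the fallback
# pattern, not these names, is the point.
class ObjectStoreUnavailable(Exception):
    pass

def get_from_object_store(key):
    # Placeholder for a real object-storage call (e.g. a COS GET);
    # here it always fails to demonstrate the fallback path.
    raise ObjectStoreUnavailable("simulated TLS/connectivity failure")

def get_from_database(key):
    # Placeholder for the authoritative copy kept in the database.
    return b"content for " + key.encode()

def fetch_content(key):
    """Prefer cached content from object storage; fall back to the
    database copy when the store is unreachable."""
    try:
        return get_from_object_store(key)
    except ObjectStoreUnavailable:
        return get_from_database(key)

print(fetch_content("lesson-123"))  # -> b'content for lesson-123'
```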

resolved

This incident has been resolved.

monitoring

Service has been restored; we are currently monitoring.

investigating

We are continuing to investigate this issue.

investigating

We are experiencing elevated errors when loading BenchPrep Ascend and BenchPrep Engage. We are currently investigating.

Report: "Degraded performance with IBM Cloud"

Last update
resolved

While IBM is still working out other issues (https://cloud.ibm.com/status?selected=status), we are going to resolve this incident as all our services continue to stay stable.

monitoring

Since 19:02 CST we have seen significant performance improvements in our services and response times. We are actively monitoring the situation at IBM (they have updated their status page: https://cloud.ibm.com/status?selected=status) and will keep you updated.

investigating

Here is what we know:
* We have reached out to IBM, and our representatives are aware of the situation.
* IBM is experiencing world-wide network outages (even impacting their status page https://cloud.ibm.com/status and support help desks), with various additional sources validating it (https://status.aspera.io/, https://downdetector.com/status/ibm-cloud, etc.).
* At this moment, the issue appears to be specific to IBM's public networking.
* Our services are not heavily impacted due to our architecture, as well as the fact that IBM's private network traffic is healthy at the moment; we have confirmed that we are still serving traffic in our application.

We will keep updating this as we find out more.

investigating

We are seeing some issues with IBM Cloud networking. We are reaching out to vendor for more information.

Report: "Elevated API Errors"

Last update
postmortem

**Date:** May 21, 2020
**Date of Incident:** May 2, 2020
**Raised by:** System Monitoring
**Severity Level:** High

**Description**
At 1:54 PM CST, BenchPrep staff was notified of site instability issues via system monitors. BenchPrep engineers soon began looking into container and cluster status.

**Root Cause**
A traffic spike caused the memory usage of individual pods to creep up. This resulted in the host nodes running out of memory. Due to the lack of available memory, the Kubernetes cluster was unable to restart new healthy pods, which resulted in the backend API service becoming unresponsive.

**Resolution**
BenchPrep engineers restarted all API pods, which restored stability during the investigation.

**Mitigation Strategies**
We have added additional alerts on host nodes for when memory consumption is high. We are reviewing our deployed resources and scaling down less frequently used ones in order to reduce the per-node memory footprint. We are also investigating adding an additional node to each cluster for additional stability.

**Timeline**
01:54 PM CST - Notification of site instability from Pingdom
02:07 PM CST - BenchPrep engineers began troubleshooting
02:20 PM CST - BenchPrep platform put into maintenance mode during investigation
02:40 PM CST - BenchPrep platform stability restored and maintenance mode disabled

Total resolution time: 46 minutes
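
The first mitigation, alerting when host-node memory runs high, could look like the following watchdog sketch; the 90% threshold and print-based alert hook are assumptions, not BenchPrep's actual monitoring stack.

```python
import time
import psutil  # third-party: pip install psutil

MEMORY_ALERT_PERCENT = 90.0    # illustrative threshold
CHECK_INTERVAL_SECONDS = 30

def alert(message):
    # Stand-in for a real pager/NewRelic/Slack hook.
    print(f"ALERT: {message}")

def watch_host_memory():
    """Poll host memory so operators hear about pressure before the
    node is too exhausted to schedule replacement pods."""
    while True:
        used = psutil.virtual_memory().percent
        if used >= MEMORY_ALERT_PERCENT:
            alert(f"host memory at {used:.1f}% (threshold {MEMORY_ALERT_PERCENT}%)")
        time.sleep(CHECK_INTERVAL_SECONDS)

if __name__ == "__main__":
    watch_host_memory()
```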

resolved

We have identified an issue with high system memory utilization and have corrected it. We will continue to monitor the situation.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "Failed production database switch"

Last update
postmortem

**Description**
At 09:50 CDT, BenchPrep's database connection pooling software lost its connection to the backend database during an attempt to switch to a new database server. This in turn caused an outage of all platform services. The loss of the database connection was noticed immediately, and the original database server was back online by 09:53 CDT. BenchPrep's initial attempt to revert the switch back to the original server did not resolve the connection issue, however. In order to restore platform functionality, BenchPrep reconfigured applications to connect to the database directly, instead of via the connection pool. This configuration went into place at 10:15 CDT. While successful, it was not optimal, resulting in a minor intermittent outage. Final restoration of all configuration changes happened at 11:08 CDT, with all applications switched over at 11:30 CDT.

**Root Cause**
Investigation revealed that the initial switchover attempt was missing a necessary change to a connection pool configuration file corresponding to the changes made to the database server endpoint address. Additionally, the initial configuration rollback was incomplete, missing the expected connection port.

**Resolution**
BenchPrep engineers prepared a more thorough reversion of all configuration changes associated with the switchover, which was carefully reviewed and manually tested before any more application changes were made. This configuration was in place at 11:08 CDT, and all platform services fully switched to it at 11:30 CDT.

**Mitigation Strategies**
* Rather than replacing database connection parameters all at once, future connection pool configuration changes will be put in place in the environment alongside the existing configuration, and platform applications will only switch after verifying that the new connection works.
* Prior to any such configuration change, a rollback branch will be prepared ahead of time in case it is needed.
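
The first mitigation strategy, verifying a new connection before any application switches to it, might be sketched as follows; the DSNs are placeholders, and a real cutover would reconfigure the pooler rather than an in-process handle.

```python
import psycopg2  # third-party: pip install psycopg2-binary

# Placeholder DSNs for the old and candidate database endpoints.
CURRENT_DSN = "host=db-old.internal port=5432 dbname=app user=app"
CANDIDATE_DSN = "host=db-new.internal port=5432 dbname=app user=app"

def connection_is_healthy(dsn):
    """Connect and run a trivial query; only a full round trip counts."""
    try:
        conn = psycopg2.connect(dsn, connect_timeout=5)
        try:
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
                return cur.fetchone() == (1,)
        finally:
            conn.close()
    except psycopg2.Error:
        return False

def choose_dsn():
    # Keep the known-good configuration unless the candidate verifies.
    return CANDIDATE_DSN if connection_is_healthy(CANDIDATE_DSN) else CURRENT_DSN
```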

resolved

This incident has been resolved.

monitoring

At 9:50 AM CT, as part of routine database health improvements, we attempted a failover to a secondary cluster, at which point we lost client-side connection pooling. We restored the service at 10:15 AM CT and are actively monitoring the situation. Because of this, the next attempt will happen during the next regularly scheduled downtime.

Report: "Degraded Performance"

Last update
resolved

We received reports of slowness on the platform. We identified and associated those reports with the reload of our database connection pooling software initiated at 2:22 PM CST. The slowness was due to the incremental rollout of new connection pods across multiple data centers.

investigating

We are currently seeing the degraded performance across the platform and are investigating.

Report: "IBM Cloud Node Outages"

Last update
resolved

We are closing the issue - all BenchPrep services are restored, albeit running through fewer datacenters.

monitoring

While IBM is dealing with the issue, we have decided to direct all traffic off the impacted cluster. That means all services should be back to normal; we are monitoring the situation so that we can bring the location back once IBM is done.

investigating

P3 incident being investigated: We are aware of an incident that IBM Cloud (our server provider) is experiencing, which is causing intermittent loading issues in Blueprint and the Tenant Admin Dashboard. We are in communication with IBM and closely monitoring the impact to our system. Once IBM resolves the incident, service should resume as normal. We will communicate updates as we get them from IBM, and you can also check the IBM Cloud status page directly (specifically the "Node Outages" incident). This issue is sporadic, and a refresh may correct it.

Report: "Elevated API Errors"

Last update
postmortem

**Date:** 2019/08/02
**Date of Incident:** 2019/08/02
**Raised by:** BenchPrep
**Severity Level:** High

**Description**
BenchPrep monitoring triggered alerts that the API servers/containers had lost connectivity to our Redis database service provided by IBM.

**Root Cause**
IBM experienced network connectivity issues between their services. This prevented both of BenchPrep's Kubernetes clusters from connecting to the Redis backend service. For more information, see IBM Incident ID INC0999643 on their cloud status page (https://cloud.ibm.com/status) and search for the incident ID.

**Resolution**
The IBM Redis database services recovered before we were able to migrate to the alternative service.

**Mitigation Strategies**
We will keep the alternative Redis instance provisioned and plan on migrating over to our own internally managed Redis solution.

**Timeline**
10:04 AM CT - First email alerts came in
10:11 AM CT - BenchPrep engineers started provisioning alternative services
10:21 AM CT - Redis services were fully restored

A downloadable copy of this report can be found here: https://drive.google.com/file/d/14p16oqEw_SAaFOz-cXc8CY6gIbmwJKzV/view?usp=sharing
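
The mitigation of keeping an alternative Redis instance provisioned implies client-side failover. Here is a minimal sketch with the redis-py client; the endpoints are placeholders, and production failover would normally live at the infrastructure layer rather than in application code.

```python
import redis  # third-party: pip install redis

# Placeholder endpoints: the managed primary and the internally
# managed standby the postmortem says will stay provisioned.
PRIMARY = {"host": "redis-primary.internal", "port": 6379}
STANDBY = {"host": "redis-standby.internal", "port": 6379}

def connect_with_failover():
    """Return a client for the first reachable Redis endpoint."""
    for endpoint in (PRIMARY, STANDBY):
        client = redis.Redis(socket_timeout=2, **endpoint)
        try:
            client.ping()  # cheap health check before handing it out
            return client
        except redis.exceptions.RedisError:
            continue
    raise RuntimeError("no Redis endpoint reachable")
```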

resolved

IBM has resolved database connectivity issues.

monitoring

Connectivity has been restored to our provider's database service. We will continue to monitor the situation.

identified

One of our backend database providers is experiencing a connectivity outage. We are provisioning an alternative and will attempt to migrate services.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "Elevated API Errors"

Last update
postmortem

**Date:** 2019/07/31
**Date of Incident:** 2019/07/31
**Raised by:** BenchPrep
**Severity Level:** High

**Description**
BenchPrep monitoring triggered alerts that the API servers/containers had lost connectivity to database services. BenchPrep began looking into the issue and discovered that the PostgreSQL connection pooling services were unable to authenticate with the backend database. After ensuring the correct configuration was in place and restarting the impacted database pooling services, successful connections were established.

**Root Cause**
Our database connection pooling software (PgBouncer) ran into a known but rare bug. When new connections were opened, an extra connection parameter was sent. The version of Postgres we are running isn't compatible with that parameter. This resulted in the "invalid server parameter" error, and insufficient valid connections were available. Reference: https://pgbouncer-general.pgfoundry.narkive.com/lZPDYkqn/pgbouncer-1-1-released

**Resolution**
Restarting our connection pooling software caused new connections to the backend database to be re-established. This returned stability to the site.

**Mitigation Strategies**
We will be adding additional log-level alerts to preemptively warn when connection issues start arising. This should allow us to manually intervene before a catastrophic failure occurs.

**Timeline**
04:10 PM CT - First error notification came in
04:14 PM CT - API-based services reached a 100% error rate and the site went down
04:22 PM CT - Partial availability restored
04:25 PM CT - Services fully restored

A downloadable copy of this report can be found here: https://drive.google.com/file/d/1m_5EJsnJ8udRuk00bgJbSTtqTpVXrvPW/view?usp=sharing
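
The mitigation, log-level alerts that fire before connection errors become catastrophic, could be sketched as a log tailer. The log path and thresholds below are assumptions for illustration; only the "invalid server parameter" string comes from the root cause above.

```python
import time
from collections import deque

# Assumed log location and policy; the error string itself comes from
# the postmortem's root cause.
LOG_PATH = "/var/log/pgbouncer/pgbouncer.log"
ERROR_TEXT = "invalid server parameter"
ALERT_AFTER = 5        # errors ...
WINDOW_SECONDS = 300   # ... within five minutes

def tail_and_alert():
    """Tail the PgBouncer log and warn on repeated connection errors
    so operators can restart the pooler before connections run out."""
    timestamps = deque()
    with open(LOG_PATH) as log:
        log.seek(0, 2)  # jump to end of file, like `tail -f`
        while True:
            line = log.readline()
            if not line:
                time.sleep(1)
                continue
            if ERROR_TEXT in line:
                now = time.time()
                timestamps.append(now)
                while timestamps and timestamps[0] < now - WINDOW_SECONDS:
                    timestamps.popleft()
                if len(timestamps) >= ALERT_AFTER:
                    print(f"ALERT: {len(timestamps)} '{ERROR_TEXT}' errors "
                          f"in the last {WINDOW_SECONDS}s")
```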

resolved

This incident has been resolved.

monitoring

We have identified an issue with database connectivity and have corrected it. We will continue to monitor the situation.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "Outage - Redis"

Last update
resolved

This incident has been resolved.

monitoring

We rolled out Redis backup services. The platform should be operational now; we are continuing to monitor and verify it.

identified

We have switched to a competing Redis offering and will be deploying it shortly.

identified

While we are waiting for IBM, we are provisioning our backup services. Once they are up, we should be able to move fairly fast, and service should be restored shortly.

identified

Both Compose connectors available to us are not working. We have escalated the issue with IBM to make sure a new connection is available as soon as possible.

identified

The issue has been traced to missing connection strings related to the IBM Compose maintenance (https://status.compose.com/). We are updating them, which should bring services back shortly.

investigating

We have received an alert from our Redis cluster; this will cause degraded performance for backend services. We are investigating and working on remediation.

Report: "Degraded performance"

Last update
resolved

This incident has been resolved.

monitoring

The issue was identified and fixed, and we are currently verifying and monitoring.

investigating

We are currently seeing degraded performance in the database cluster.