Calibre

Is Calibre Down Right Now? Check whether there is an ongoing outage.

Calibre is currently Operational

Last checked from Calibre's official status page

Historical record of incidents for Calibre

Report: "Application unavailable"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Application unavailable"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Application outage"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor while application availability is restored.

investigating

We are currently investigating application unavailability.

Report: "Test processing delays in all regions"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented. We are monitoring as systems recover.

investigating

We are currently investigating test processing delays in all testing regions. You might experience recent Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews and API-invoked tests). We will provide more information as soon as it’s available.

Report: "Test failures"

Last update
resolved

On Friday 6th of October, Calibre experienced intermittent test failures on desktop devices. These failures were reported as an inability to connect to the Chrome web browser. The issue has since been resolved and is not expected to recur.

Report: "Test delays in Canada region"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are currently investigating test delays in the Canada region. You might experience some Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews tests). We will provide more information as soon as it’s available.

Report: "Test Processing Delay"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating a delay in tests being processed.

Report: "Test delays in North Virginia region"

Last update
resolved

This incident has been resolved.

monitoring

There is a significant backlog of tests. We've increased Test Agents in the North Virginia region and are monitoring as queues return to normal.

identified

We are currently investigating test delays in the North Virginia region. You might experience some Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews tests). We will provide more information as soon as it’s available.

Report: "API requests are affected by an upstream DNS issue"

Last update
resolved

The upstream DNS issue has been resolved. We've observed traffic to api.calibreapp.com and can confirm that the issue is resolved.

monitoring

api.calibreapp.com is now partially resolving. We're continuing to monitor for the moment and will resolve this issue when we've seen confirmation from the upstream provider.

identified

An upstream DNS issue is preventing API requests (Node.js API, Command Line Client) made to api.calibreapp.com from resolving correctly. We are tracking this issue with our upstream provider and will provide updates as they come.
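
For anyone scripting against the API during an incident like this one, a quick way to tell a DNS failure apart from an application error is to check whether the hostname resolves at all. The following is a minimal sketch using only the Python standard library. The function name `resolve_host` and its `resolver` parameter are illustrative assumptions, not part of Calibre's Node.js API or Command Line Client.

```python
import socket
from typing import Callable, List


def resolve_host(
    hostname: str,
    port: int = 443,
    resolver: Callable = socket.getaddrinfo,
) -> List[str]:
    """Return the sorted, unique IP addresses for hostname.

    Returns an empty list when name resolution fails. The resolver
    argument is a hypothetical hook so the lookup can be replaced
    in tests; by default it uses the system resolver.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution failed (for example, an upstream DNS outage).
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in results})
```

Calling `resolve_host("api.calibreapp.com")` would return an empty list while the name fails to resolve, and a list of addresses once resolution recovers. A partially resolving name, as described in the monitoring update, may give different answers depending on which resolver is queried.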

Report: "Application is not accessible"

Last update
resolved

Due to a reported incident with our application hosting provider, Calibre’s application interface was not accessible for approximately 30 minutes between 3:30 and 4:00 AM AEST (16:30 and 17:00 UTC). Testing was NOT affected and the application was fully operational after the incident window.

Report: "Degraded application performance"

Last update
resolved

This incident has been resolved.

monitoring

The fix has been applied. We'll continue to monitor.

identified

We are currently experiencing an outage: the application and APIs are not accessible. Tests are running in the background and are not affected by this issue. We are working on a fix and will post updates as they come.

identified

We've identified the issue and are currently applying a fix.

investigating

We are currently investigating elevated error reports for Calibre's user interface and APIs.

Report: "Temporary test unavailability and 1.5 hrs of downtime"

Last update
postmortem

On December 3rd, [Calibreapp.com](http://calibreapp.com) suffered approximately **1 hour 30 minutes of downtime following difficulties during a routine data migration** followed by a period of degraded performance. During the data migration, tests recorded prior to December 3rd were temporarily unavailable to view. New tests were being conducted, but delayed in aggregation due to the ongoing data migration and also temporarily unavailable. **No data was lost.**

## **Monday 3rd December, 4:25pm AEST**

45 minutes into the data migration we noticed drastically degraded Postgres database performance, which brought [Calibreapp.com](http://calibreapp.com) down for almost an hour.

## **Monday 3rd December, 5:30pm AEST**

[Calibreapp.com](http://calibreapp.com) was brought back up while still experiencing degraded performance due to the migration load.

## **Monday 3rd December, 9:30pm AEST**

A routine vacuum and automatic daily database backups started running and operating on the same table that was being migrated, which caused further issues.

## **Tuesday 4th December, 8:00am AEST**

By Tuesday the migration had progressed to process data back to September 2018, which meant that timeline metrics were 100% available, but detailed reports of those tests were still unavailable for view. We continued to monitor the database.

## **Wednesday 5th December, 7:27pm AEST**

Following numerous process efficiency fixes and replacing a database replica, the remaining queue backlog was processed smoothly and the service came back to full availability.

resolved

This incident has been resolved.

Report: "Test infrastructure unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we're continuing to monitor the situation.

identified

The issue has been identified and we are issuing a fix.

investigating

We are investigating a backlog of tests in all regions.

Report: "Application unavailable"

Last update
resolved

This incident has been resolved.

monitoring

We've experienced 10 minutes of downtime, but have been able to recover. We're continuing to monitor the situation.

Report: "App experiencing downtime"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating.

Report: "Down time"

Last update
resolved

We experienced 20 minutes of downtime due to a Let’s Encrypt configuration failure during routine maintenance.

Report: "App experiencing downtime"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue which is resulting in delays in test completion and intermittent downtime.

Report: "App not accessible"

Last update
resolved

DNS issues have been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating a DNS issue.

Report: "Website and documentation unavailable to certain visitors"

Last update
resolved

Partial unavailability of our website and documentation was caused by elevated error rates in the SFO1 region of our infrastructure provider. The issue persisted for 28 minutes (01:26 to 01:54 UTC), during which you might have experienced limited availability and timeouts on those sites.

monitoring

Website access is restored. We are further monitoring the availability to ensure the issue is fully resolved.

investigating

We are currently investigating a partial outage of our marketing site and documentation, which might be unavailable to some visitors. We will provide more information as soon as we identify the issue. The Calibre application, tests and APIs are NOT affected; your monitoring is running uninterrupted.

Report: "Snapshots and test results delays"

Last update
resolved

Snapshots are now being processed normally. All previously pending test results have been released.

monitoring

We identified an issue causing an increased load in releasing Snapshot test results. We have issued a fix and are watching the processing of test results recover. Your tests will be marked as completed as soon as they are processed.

investigating

We are currently investigating delays in processing test results and Snapshots. You might experience Snapshots being marked as "testing" for a prolonged time. No test data has been lost. We will provide more information as soon as we identify the issue.

Report: "Some tests failing to run in selected regions"

Last update
resolved

We have identified and addressed the issue behind some tests in some regions resulting in failures. All tests should now be running successfully. Past tests that failed will be re-run.

identified

We are currently investigating test failures in selected regions. You might experience some Snapshots being marked as "failed". We’re working with our compute provider to resolve this issue. We will provide more information as soon as it’s available.

identified

We are continuing to work on a fix for this issue.

identified

We're working with our compute provider to resolve this issue.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Snapshot delays in three regions"

Last update
resolved

All Test Agent regions are operating normally.

monitoring

We are continuing to monitor for any further issues.

monitoring

We identified an issue causing delays in testing in three Test Agent regions. We have issued a fix and are watching the test completion recover.

Report: "Snapshot failures in several regions"

Last update
resolved

All Test Agent regions are now fully operational.

monitoring

We have implemented a fix and are monitoring Snapshot completion rates.

investigating

We have narrowed down Snapshot issues to two Test Agent regions: London and North Virginia. We're continuing to investigate the root cause and a solution.

investigating

We are currently investigating test failures and delays in selected regions. You might experience some Snapshots being marked as "failed" or results being delayed. We will provide more information as soon as it’s available.

Report: "Snapshot failures in several regions"

Last update
resolved

All Test Agent regions are now fully operational.

monitoring

We are observing recurrent test failures and delays in the North Virginia Test Agent region. We're continuing to investigate the root cause of the issue and a permanent solution. Thank you for your patience.

monitoring

A fix has been deployed. We are now monitoring with a view to resolving this issue soon.

investigating

We're continuing to investigate the root cause of the issue and a solution. Rest assured that test results for failed tests won’t be lost and will become available once the incident is resolved.

investigating

We are currently investigating test failures and delays in selected regions and tests. You might experience some Snapshots being marked as "failed" or results being delayed. We will provide more information as soon as it’s available.

Report: "Test failures in several regions"

Last update
resolved

All Test Agent regions are now fully operational.

identified

We are still observing recurring test delays in several Test Agent regions. Tests in those areas might be delayed, but will eventually be completed. We continue working on a reliable solution to this issue.

monitoring

A fix has been put in place and we will be monitoring our systems to confirm resolution.

investigating

We are continuing to investigate this issue.

investigating

We are investigating an ongoing issue where some tests cannot be completed successfully. In some cases tests will fail or be delayed for several minutes.

Report: "Tests delayed in London region"

Last update
resolved

This incident has been resolved.

monitoring

There is a backlog of tests in the London region. We are continuing to monitor the situation while the backlog is being processed.

Report: "Application is not accessible"

Last update
resolved

This incident has been resolved.

identified

The application interface is again accessible with the exception of users logging in with SAML (SSO). Again, testing is not affected. We continue working on full issue resolution. Thank you for your patience!

investigating

We are currently investigating an issue with the Calibre application (interface) not being accessible. Testing is NOT affected and running without issues. We will provide more information as soon as it’s available.

Report: "Test delays in North Virginia"

Last update
resolved

Tests in North Virginia are now running normally.

monitoring

We have identified the issue to be caused by an outage of one of our providers, Amazon AWS. A fix to bypass the outage is in place and tests should be resumed and completed shortly. Thank you for your patience.

investigating

We are currently investigating test failures and delays in the North Virginia region. You might experience some Snapshots being marked as "failed" or "scheduled" and results being delayed. We will provide more information as soon as it’s available.

Report: "Website and documentation partially unavailable"

Last update
resolved

This incident has been resolved.

monitoring

Our hosting provider has implemented a mitigation strategy, and the marketing website and documentation should now be fully available without degraded performance. We will continue monitoring the situation to ensure the stability of Calibre's sites.

identified

Due to an outage at our hosting provider, Vercel, our marketing website and documentation might be unavailable to some visitors, or you might observe degraded performance. The Calibre application, tests and APIs are NOT affected; your monitoring is running without any issues. You can access your Calibre account at https://calibreapp.com/home.