Historical record of incidents for Calibre
Report: "Application unavailable"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Application unavailable"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Application outage"
Last update: This incident has been resolved.
We are continuing to monitor while application availability is restored.
We are currently investigating application unavailability.
Report: "Test processing delays in all regions"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented. We are monitoring as systems recover.
We are currently investigating test processing delays in all testing regions. You might experience recent Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews and API-invoked tests). We will provide more information as soon as it’s available.
Report: "Test failures"
Last update: On Friday 6th of October, Calibre experienced intermittent test failures on desktop devices. These failures were reported as an inability to connect to the Chrome web browser. The issue has since been resolved and is not expected to recur.
Report: "Test delays in Canada region"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating test delays in the Canada region. You might experience some Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews tests). We will provide more information as soon as it’s available.
Report: "Test Processing Delay"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating a delay in tests being processed.
Report: "Test delays in North Virginia region"
Last update: This incident has been resolved.
There is a significant backlog of tests. We've increased Test Agents in the North Virginia region and are monitoring as queues return to normal.
We are currently investigating test delays in the North Virginia region. You might experience some Snapshots being marked as "scheduled" and results being delayed (including Pull Request Reviews tests). We will provide more information as soon as it’s available.
Report: "API requests are affected by an upstream DNS issue"
Last update: The upstream DNS issue has been resolved. We've observed traffic to api.calibreapp.com and can confirm that requests are resolving correctly.
api.calibreapp.com is now partially resolving. We're continuing to monitor for the moment and will mark this incident as resolved once we've seen confirmation from the upstream provider.
An upstream DNS issue is preventing API requests (Node.js API, Command Line Client) made to api.calibreapp.com from resolving correctly. We are tracking this issue with our upstream provider and will provide updates as they come.
Report: "Application is not accessible"
Last update: Due to a reported incident with our application hosting provider, Calibre’s application interface was not accessible for approximately 30 minutes between 3:30 and 4:00 AM AEST (16:30 and 17:00 UTC). Testing was NOT affected and the application was fully operational after the incident window.
Report: "Degraded application performance"
Last update: This incident has been resolved.
The fix has been applied. We'll continue to monitor.
We are currently experiencing an outage: the application and APIs are not accessible. Tests are running in the background and are not affected by this issue. We are working on a fix and will post updates as they come.
We've identified the issue and are currently applying a fix.
We are currently investigating elevated error reports for Calibre's user interface and APIs.
Report: "Temporary test unavailability and 1.5 hrs of downtime"
Last update: On December 3rd, [Calibreapp.com](http://calibreapp.com) suffered approximately **1 hour 30 minutes of downtime following difficulties during a routine data migration** followed by a period of degraded performance. During the data migration, tests recorded prior to December 3rd were temporarily unavailable to view. New tests were being conducted, but delayed in aggregation due to the ongoing data migration and also temporarily unavailable. **No data was lost.**

## **Monday 3rd December, 4:25pm AEST**

45 minutes into the data migration we noticed drastically degraded Postgres database performance, which brought [Calibreapp.com](http://calibreapp.com) down for almost an hour.

## **Monday 3rd December, 5:30pm AEST**

[Calibreapp.com](http://calibreapp.com) was brought back up while still experiencing degraded performance due to the migration load.

## **Monday 3rd December, 9:30pm AEST**

A routine vacuum and the automatic daily database backups started running on the same table that was being migrated, which caused further issues.

## **Tuesday 4th December, 8:00am AEST**

By Tuesday the migration had processed data back to September 2018, which meant that timeline metrics were 100% available, but detailed reports of those tests were still unavailable to view. We continued to monitor the database.

## **Wednesday 5th December, 7:27pm AEST**

Following numerous process efficiency fixes and the replacement of a database replica, the remaining queue backlog was processed smoothly and the service came back to full availability.
This incident has been resolved.
Report: "Test infrastructure unavailable"
Last update: This incident has been resolved.
A fix has been implemented and we're continuing to monitor the situation.
The issue has been identified and we are issuing a fix.
We are investigating a backlog in testing in all regions.
Report: "Application unavailable"
Last update: This incident has been resolved.
We've experienced 10 minutes of downtime, but have been able to recover. We're continuing to monitor the situation.
Report: "App experiencing downtime"
Last update: This incident has been resolved.
We are currently investigating.
Report: "Down time"
Last update: We experienced 20 minutes of downtime due to a Let’s Encrypt configuration failure during routine maintenance.
Report: "App experiencing downtime"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating an issue which is resulting in delays in test completion and intermittent downtime.
Report: "App not accessible"
Last update: DNS issues have been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating a DNS issue.
Report: "Website and documentation unavailable to certain visitors"
Last update: Partial unavailability of our website and documentation was caused by elevated error rates in the SFO1 region of our infrastructure provider. The issue persisted for 28 minutes (01:26 to 01:54 UTC), during which you might have experienced limited availability and timeouts on these sites.
Website access is restored. We are continuing to monitor availability to ensure the issue is fully resolved.
We are currently investigating a partial outage of our marketing site and documentation, which might be unavailable to some visitors. We will provide more information as soon as we identify the issue. The Calibre application, tests and APIs are NOT affected; your monitoring is running uninterrupted.
Report: "Snapshots and test results delays"
Last update: Snapshots are now being processed normally. All previously pending test results have been released.
We identified an issue causing increased load when releasing Snapshot test results. We have issued a fix and are watching the processing of test results recover. Your tests will be marked as completed as soon as they are processed.
We are currently investigating delays in processing test results and Snapshots. You might experience Snapshots being marked as "testing" for a prolonged time. No test data has been lost. We will provide more information as soon as we identify the issue.
Report: "Some tests failing to run in selected regions"
Last update: We have identified and addressed the issue that caused some tests to fail in some regions. All tests should now be running successfully. Past tests that failed will be re-run.
We are currently investigating test failures in selected regions. You might experience some Snapshots being marked as "failed". We’re working with our compute provider to resolve this issue. We will provide more information as soon as it’s available.
We are continuing to work on a fix for this issue.
We're working with our compute provider to resolve this issue.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Snapshot delays in three regions"
Last update: All Test Agent regions are operating normally.
We are continuing to monitor for any further issues.
We identified an issue causing delays in testing in three Test Agent regions. We have issued a fix and are watching the test completion recover.
Report: "Snapshot failures in several regions"
Last update: All Test Agent regions are now fully operational.
We have implemented a fix and are monitoring Snapshot completion rates.
We have narrowed down Snapshot issues to two Test Agent regions: London and North Virginia. We're continuing to investigate the root cause and work on a solution.
We are currently investigating test failures and delays in selected regions. You might experience some Snapshots being marked as "failed" or results being delayed. We will provide more information as soon as it’s available.
Report: "Snapshot failures in several regions"
Last update: All Test Agent regions are now fully operational.
We are observing recurrent test failures and delays in the North Virginia Test Agent region. We're continuing to investigate the root cause of the issue and work on a permanent solution. Thank you for your patience.
A fix has been deployed. We are now monitoring with a view to resolving this issue soon.
We're continuing to investigate the root cause and work on a solution for the issue. Rest assured that test results for failed tests won’t be lost and will become available once the incident is resolved.
We are currently investigating test failures and delays in selected regions and tests. You might experience some Snapshots being marked as "failed" or results being delayed. We will provide more information as soon as it’s available.
Report: "Test failures in several regions"
Last update: All Test Agent regions are now fully operational.
We are still observing recurring test delays in several Test Agent regions. Tests in those regions might be delayed, but will eventually be completed. We are continuing to work on a reliable solution to this issue.
A fix has been put in place and we will be monitoring our systems to confirm the resolution.
We are continuing to investigate this issue.
We are investigating an ongoing issue where some tests are not able to be completed successfully. In some cases tests will fail, or be delayed for a period of minutes.
Report: "Tests delayed in London region"
Last update: This incident has been resolved.
There is a backlog of tests in the London region. We are continuing to monitor the situation while the backlog is being processed.
Report: "Application is not accessible"
Last update: This incident has been resolved.
The application interface is accessible again, with the exception of users logging in with SAML (SSO). Again, testing is not affected. We are continuing to work on a full resolution. Thank you for your patience!
We are currently investigating an issue with the Calibre application (interface) not being accessible. Testing is NOT affected and running without issues. We will provide more information as soon as it’s available.
Report: "Test delays in North Virginia"
Last update: Tests in North Virginia are now running normally.
We have identified the cause as an outage at one of our providers, Amazon Web Services (AWS). A fix to bypass the outage is in place and tests should resume and complete shortly. Thank you for your patience.
We are currently investigating test failures and delays in the North Virginia region. You might experience some Snapshots being marked as "failed" or "scheduled" and results being delayed. We will provide more information as soon as it’s available.
Report: "Website and documentation partially unavailable"
Last update: This incident has been resolved.
Our hosting provider has implemented a mitigation strategy—the marketing website and documentation should now be fully available without degraded performance issues. We will continue monitoring the situation to ensure stability of Calibre's sites.
Due to an outage at our hosting provider, Vercel, our marketing website and documentation might be unavailable to some visitors, or you might observe degraded performance. The Calibre application, tests and APIs are NOT affected; your monitoring is running without any issues. You can access your Calibre account at https://calibreapp.com/home.