Librato

Is Librato Down Right Now? Check if there is a current outage ongoing.

Librato is currently Degraded

Last checked from Librato's official status page

Historical record of incidents for Librato

Report: "Some alerts are delayed"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Hourly summarizations are delayed."

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues. Next update in 3 hours.

monitoring

A fix has been implemented and we are monitoring the results. Next update in 6 hours.

Report: "Increased error rate on Web and API"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Major outage from 18:03 to 18:40 UTC"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are currently investigating this issue.

Report: "Increased error rates and delayed alerts for composite metrics"

Last update
resolved

This incident has been resolved.

monitoring

Starting at 09:00 UTC on Monday, Composite Metrics had increased error rates and delays in alert processing.

Report: "Increased error rates from 02:38 - 02:42 UTC"

Last update
resolved

Increased error rates from the API and other services for several minutes.

Report: "SSL error connecting to api.heroku.com"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Alert or Screenshot integrations with Slack are not delivered"

Last update
resolved

We are currently investigating this issue.

Report: "Alert processing errors"

Last update
resolved

From 01:33 to 02:17 UTC Alerts and metric ingestion were disrupted. Metrics sent by some agents may have gaps during that time and Absent Alerts may have fired incorrectly.

investigating

We are currently investigating this issue.

Report: "Delayed historical summaries when viewing spans of time 3 days long or longer"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "Delayed metrics"

Last update
resolved

This incident has been resolved.

identified

We continue to work on the metric processing pipeline.

monitoring

We continue to process the backlog. Most metrics and alerts should be back to normal behavior.

monitoring

We continue to process the backlog. Most metrics and alerts should be back to normal behavior.

identified

The issue has been identified and a fix is being implemented.

Report: "Increased Error rates"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Delays in processing metrics"

Last update
resolved

This incident has been resolved.

monitoring

The backlog has been processed and Alerts and Service Side Aggregated Metrics have returned to normal performance.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Metric ingestion is delayed"

Last update
resolved

This incident has been resolved.

monitoring

We are working on the backlog of ingested metrics.

identified

The issue has been identified and a fix is being implemented.

Report: "Service Side Metric Aggregation is delayed"

Last update
resolved

This incident affected: Service-Side Aggregated Metrics.

Report: "Delayed metrics from AWS CloudWatch"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "The marketing site is down. Users can log in to Librato using https://metrics.librato.com/sign_in."

Last update
resolved

This incident has been resolved.

investigating

The marketing site is down. Users can log in to Librato using https://metrics.librato.com/sign_in.

Report: "Increased Errors on https://my.appoptics.com/"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "The home page is not accessible, use https://metrics.librato.com/users/sign_in"

Last update
resolved

This incident has been resolved.

monitoring

The provider we use for the home page had an incident (https://status.heroku.com/incidents/2402) that now appears to be resolved.

investigating

We are currently investigating this issue.

Report: "Alerts, and metrics were delayed from 12:10PM UTC - 2:20PM UTC"

Last update
resolved

This incident has been resolved.

monitoring

Traces, alerts, and metrics have returned to normal operation.

Report: "Metrics and Alerts processing is delayed"

Last update
resolved

This incident has been resolved.

monitoring

Alert processing has returned to normal.

identified

We are working on restoring full functionality to alerts.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Delayed alerts for small number of metrics (23:30 UTC on 22Apr2021 to 21:15 UTC 23Apr2021)"

Last update
resolved

This incident has been resolved.

monitoring

"A small number of alerts were delayed starting at 23:30 UTC on 22Apr2021. As of 21:15 UTC 23Apr2021 all delayed alerts have been delivered and alert processing has returned to normal."

Report: "Increased Error rates when viewing data in a 3 day window or longer"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "CloudWatch Message Ingestion Issue"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are investigating an issue with delayed metric ingestion for our CloudWatch integration. We will update here when more details are available.

Report: "A Fraction of Alerts Delayed"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate the root cause of the issue. Alerts continue to be delayed.

investigating

We are continuing to investigate the root cause of the issue. Alerts continue to be delayed.

investigating

Alerts are delayed. We are investigating the root cause of the issue.

investigating

Alerts may be delayed. We are investigating the root cause of the issue.

Report: "Heroku outage affecting Librato"

Last update
resolved

This incident has been resolved.

investigating

This issue will also cause lower throughput of logs from Heroku.

investigating

An issue with Heroku is causing the site www.librato.com to be inaccessible. We are investigating.

Report: "Heroku outage affecting Librato"

Last update
resolved

This incident has been resolved.

investigating

An issue with Heroku is causing the site www.librato.com to be inaccessible. We are investigating.

Report: "Heroku Log Metrics disruption"

Last update
resolved

This incident has been resolved.

monitoring

We are now receiving Heroku logs at the expected volume. Heroku customers may see a gap in log coverage. The Heroku incident is still open so we'll continue to monitor.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "database issue"

Last update
resolved

This incident has been resolved.

monitoring

Normal operations have resumed and we continue to monitor the situation.

investigating

We are continuing to investigate this issue.

investigating

There is an identified issue with the database that is affecting performance. We are currently investigating. Please check back for updates.

Report: "spurious absent alerts and delayed alerts"

Last update
resolved

This incident has been resolved.

investigating

A database performance issue was identified and we have resolved it. We are working on scaling the database to avoid this problem in the future.

investigating

We are aware of an issue causing spurious absent alerts and delayed alerts. We are currently investigating this issue.

Report: "false alerts being generated"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "delayed AWS metrics"

Last update
resolved

This incident has been resolved.

investigating

We've recovered from the delay in CloudWatch metrics processing. You should see up-to-date metrics in AppOptics/Librato. Please contact support if you need further help.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Delays in processing metrics and metric alerts."

Last update
resolved

This incident has been resolved.

monitoring

Any spurious absent alerts should now be resolved and we continue to monitor the recovery.

investigating

Any spurious absent alerts should now be resolved and we continue to monitor the recovery.

investigating

We are investigating an Alerts service issue that is resulting in spurious absent alerts.

investigating

We are currently investigating this issue.

Report: "Issues with metric ingestion from Heroku"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Alerts processing delayed"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Delayed Alerts"

Last update
resolved

Resolved. Absent alerts did not fire from 22:57 to 23:23 UTC.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Site issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are aware of an issue and are investigating.

Report: "CloudWatch metrics delayed for us-west-2"

Last update
resolved

CloudWatch metrics for us-west-2 are no longer delayed.

identified

CloudWatch metrics from us-west-2 have been delayed since 17:45 UTC due to an AWS API outage. Metrics will be backfilled when the API is accessible again.

Report: "Delayed Alerts"

Last update
resolved

Between 00:30 UTC and 01:45 UTC there were processing issues with the alerting pipeline that may have resulted in some alerts being accidentally triggered and/or delayed.

Report: "Investigating issues with page loading"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate the issue. Composite alerts may be delayed.

investigating

We are continuing to investigate the issue. Composite alerts may be delayed.

investigating

Some customers may be experiencing increased API latency or error rate at this time. We're continuing to work on this issue.

investigating

We are currently investigating issues with page loading.

Report: "API Latency"

Last update
resolved

This incident has been resolved.

investigating

We have noticed a higher than normal latency when using the API. We are currently investigating the root cause.

Report: "API Latency"

Last update
resolved

API latency has returned to normal.

monitoring

API Latency appears to be returning to normal.

investigating

We have noticed a higher than normal latency when using the API. We are currently investigating the root cause.

Report: "Delayed collectd metrics"

Last update
resolved

This incident has been resolved.

identified

We are currently experiencing some delays in processing collectd metrics. Some users may experience delays in seeing metrics that have been posted via collectd.

Report: "Intermittent Heroku data ingestion"

Last update
resolved

Heroku data ingestion was impacted by intermittent log parsing failures between 18:25 and 18:45 UTC; some measurements may be missing or partially aggregated during this time. Service has been fully restored as of 18:45 UTC.

Report: "API calls failing"

Last update
resolved

At 22:38 UTC we experienced a database failover, which briefly caused some API calls to fail. Within a few minutes the writer recovered and all API calls resumed functioning as normal. Reads of some data could have been delayed while replicas caught up. All effects appear to have resolved as of 22:55 UTC.

Report: "Increased error count"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Metrics Processing Delays"

Last update
resolved

This incident has been resolved.

identified

Metrics processing is currently delayed.

Report: "TLS 1.0 Deprecation"

Last update
resolved

TLSv1 support has been disabled. All public endpoints will require TLSv1.1 or greater. If you have any questions or need assistance in confirming support for your browsers or clients, contact us at support@librato.com.

monitoring

The TLS configuration has been reverted. We will be disabling support for TLS 1.0 on June 25th. The vast majority of HTTPS traffic already comes from clients that support TLS 1.2, but a small percentage of users may be affected by this deprecation. On June 25th we will disable TLS 1.0 support for Librato, which may result in your browser or client no longer being able to interact with the website or the API. To avoid that, please make sure you support TLS 1.2. TLS 1.1 is supported but not recommended.

The following minimum browser versions support TLS 1.2:
- Microsoft Internet Explorer 11
- Microsoft Edge (all)
- Firefox 27
- Chrome 30
- Safari 7.0 (OS X 10.9)
- Mobile Safari 5.1 (iOS 5.1)
- Opera 17

Any modern collection agents or libraries that use SSL are likely to be supported. You can check the TLS version number on an instance by running the following curl statement from the instance in question:

curl --silent "https://www.howsmyssl.com/a/check" | grep -o -e '"tls_version":"[a-zA-Z0-9\ \.]*"'

There will be a 'tls_version' field to look for. If you have any questions or need assistance in confirming support for your browsers or clients, contact us at support@librato.com.
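For clients written in Python, the same check can be scripted. The following is only a minimal sketch, not Librato's own tooling: it assumes the check endpoint returns JSON with a 'tls_version' value shaped like "TLS 1.2", and the helper names classify_tls_version and strict_client_context are hypothetical. The second helper shows how a client could refuse anything older than TLS 1.2 using the standard library's ssl module.

```python
import json
import re
import ssl


def classify_tls_version(check_response: str) -> str:
    """Classify the 'tls_version' field of a check response (assumed JSON).

    Assumption: the value looks like "TLS 1.2". Anything else is "unknown".
    """
    version = json.loads(check_response).get("tls_version", "")
    match = re.fullmatch(r"TLS (\d+)\.(\d+)", version)
    if match is None:
        return "unknown"
    parsed = (int(match.group(1)), int(match.group(2)))
    if parsed >= (1, 2):
        return "ok"
    if parsed == (1, 1):
        # Per the notice above: supported, but not recommended.
        return "supported but not recommended"
    return "unsupported"


def strict_client_context() -> ssl.SSLContext:
    """Build a client-side context that will not negotiate below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The classifier works on a response body you have already fetched (for example, the output of the curl command above without the grep filter), so it can be run without network access.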

monitoring

The new TLS configuration is now in place.

monitoring

We are starting a brownout for the TLS 1.0 deprecation, lasting approximately 1 hour and starting at 12pm Pacific. If you are receiving high rates of HTTP status 500s during this period, the agents forwarding messages may be using TLS 1.0. We will be completing the deprecation process on June 25th. If you have any questions or need assistance in confirming support for your browsers or clients, contact us at support@librato.com.

monitoring

We will be disabling support for TLS 1.0 on June 25th. The vast majority of HTTPS traffic already comes from clients that support TLS 1.2, but a small percentage of users may be affected by this deprecation. On June 25th we will disable TLS 1.0 support for Librato, which may result in your browser or client no longer being able to interact with the website or the API. To avoid that, please make sure you support TLS 1.2. TLS 1.1 is supported but not recommended. There will be a one hour test on June 18th at 12pm PST.

The following minimum browser versions support TLS 1.2:
- Microsoft Internet Explorer 11
- Microsoft Edge (all)
- Firefox 27
- Chrome 30
- Safari 7.0 (OS X 10.9)
- Mobile Safari 5.1 (iOS 5.1)
- Opera 17

Any modern collection agents or libraries that use SSL are likely to be supported. You can check the TLS version number on an instance by running the following curl statement from the instance in question:

curl --silent "https://www.howsmyssl.com/a/check" | grep -o -e '"tls_version":"[a-zA-Z0-9\ \.]*"'

There will be a 'tls_version' field to look for. If you have any questions or need assistance in confirming support for your browsers or clients, contact us at support@librato.com.

Report: "Composite Alerts - False Positives"

Last update
resolved

Between 17:44 and 17:55 UTC there was a brief period in which a delay in measurement processing may have caused alerts that use composites to trigger without cause. The situation quickly resolved itself as the delayed service resumed normal operation.

Report: "Increased API errors"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.