Historical record of incidents for Datadog EU
Report: "Multiple components impacted by provider outage"
Last update: We are currently investigating this issue.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are investigating delays in Distribution Monitors Evaluation, which began at 5:30pm UTC. Monitors for other types of metrics are evaluating as usual.
Report: "Delayed Monitors Notifications"
Last update: We are investigating delays in Monitors Evaluation, which began at 5:30pm UTC.
Report: "Calls and SMS from Datadog On-Call impacted by telecom outage in Spain and Portugal"
Last update: This incident has been resolved.
Messages and calls to Spanish phone numbers from Datadog On-Call are not being successfully sent due to an ongoing telecom outage in Spain and Portugal. Push notifications in the Datadog mobile app and emails are still being sent and received as normal. We are monitoring the situation and will update when new information becomes available.
Report: "Delayed Cloud Network Monitoring Data"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are investigating increased latency processing Cloud Network Monitoring Data. As a result of this issue, some users may see delayed or missing data for Cloud Network Monitoring resources in the web application or API.
Report: "Metric query failures and delayed metric monitor evaluations"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating metric query failures and delays in metric monitor evaluations, which began at 16:35 UTC.
Report: "Login Issues"
Last update: This incident has been resolved.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are investigating user login issues related to reCAPTCHA for customers using password login. If you experience an issue with reCAPTCHA, refreshing the page can often mitigate the issue. Please note that data processing and alerts are not affected by this incident.
Report: "Delayed Processing for a Subset of Metrics"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating an issue with incorrect or missing host tags on metrics associated with hosts created since 9am UTC. As a result of this issue, some users may see gaps or unexpected results when using products that rely on host tags, such as dashboards, queries or alerts for those hosts.
Report: "Degraded Web Application Performance"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are continuing to work on a fix. Degraded web application performance is primarily observed in customers with low network bandwidth.
We have identified the underlying issue and are working on a fix.
We are investigating degraded performance with the web application.
Report: "Delayed APM data ingestion"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are investigating increased ingestion latency of APM data.
Report: "Delayed Metrics Monitor Evaluations"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We are investigating increased evaluation delays for metrics-based monitors for some customers.
Report: "Degraded Web Application Performance"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating degraded performance with the web application.
Report: "[SSO] Login Errors"
Last update: This incident has been resolved. If you continue to see issues, please contact Datadog technical support.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
We have identified the issue and are implementing a fix.
We are investigating user login issues with the web application when using Okta SSO.
Report: "Delayed Processing for a Subset of Distribution Metrics"
Last update: This incident has been resolved and the monitors have been re-enabled at this time.
We are continuing to monitor for any further issues.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We are investigating increased latency processing for a subset of Distribution Metrics. To prevent spurious alerts, we have temporarily disabled distribution monitors based on this data.
Report: "Web UI features may be hidden"
Last update: This incident has been resolved. Please refresh your Datadog web page to resolve the issue completely.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating an issue that is causing certain features to be hidden from our UI. There is no data loss or monitoring impact.
Report: "Delayed logs for a subset of customers"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate the issue.
We are investigating increased latency processing logs for a subset of customers. As a result of this issue, some users may see delayed log processing, and the associated log monitors are currently not being evaluated to avoid false positives.
Report: "CI Visibility - Page Load issue"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified an issue that prevents some Software Delivery pages from loading. Also, Intelligent Test Runner, Quality Gates, GitHub PR comments and Static Analysis uploads are affected. The team is working on a fix.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating delays in monitor notifications, specifically for monitors based on distribution metrics, which began at 09:08 UTC.
Report: "Delayed Metrics Monitors Notifications"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are investigating delays in metrics-based Monitors Notifications, which began at 5:40pm UTC.
Report: "We are investigating user login issues with the web application"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix.
We are investigating user login issues with the web application login by email. Please note that data processing and alerts are not affected by this incident.
Report: "Delays on multiple products"
Last update: This incident has been resolved. This incident did not affect Metrics or Infrastructure Monitoring. Between 08:25 UTC and 09:15 UTC, we experienced elevated query errors and delays ingesting data for Logs, as well as more minor impact for several other products including: APM Traces, Real User Monitoring, Synthetics, and Audit Trail. All data has since been processed, and systems are operating in real-time as normal.
We are continuing to monitor for any further issues.
We've implemented a fix, and we are monitoring the result. This incident caused delays in several products, including Logs, APM Traces, Real User Monitoring, Synthetics Test Results, and Audit Trail.
We are investigating increased latency processing Events. As a result of this issue, some users may see delays or gaps in the event stream or for event queries on dashboards. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Partial outage of metrics query"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved. Monitors using Slack and Microsoft Teams are delayed. All other monitors are working correctly. It is important to note that no data has been lost, and notifications will be caught up once the service is operational again.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix.
We are investigating delays in Monitors Notifications, which began at 13:00 UTC.
Report: "Delayed Metrics"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix. Please note that the Datadog Agent and Client Libraries will buffer data or retry to avoid gaps in data submission.
We are investigating increased latency processing Metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. This is causing delays in Monitors Notifications, which began at 14:01 UTC.
Report: "Elevated Errors for API Key Validation"
Last update: From 12:45 PM to 1:15 PM US EST, Datadog’s endpoint to validate Datadog API keys was unavailable. During this window Datadog Agents would be unable to validate their API key. In all cases Agents would continue to send data. Some Agents running in Kubernetes may be marked unhealthy until restarted. Newly started Agents would fail to start. Build jobs using our CI Visibility product would be missing custom tags and measures.
Report: "Elevated Error Rates for Log Queries and Monitors"
Last update: Fix has been fully rolled out.
The fix rollout is currently ongoing. Once completed we will confirm resolution.
We have successfully tested a fix for this issue and are currently deploying it to resolve this incident.
We're still working on a fix for historical data impacted by this incident.
We're still working on a fix for historical data impacted by this incident.
We're still working on a fix for historical data impacted by this incident.
We're still working on a fix for historical data impacted by this incident.
We're still working on a fix for historical data impacted by this incident.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved. At this time, newly ingested data is properly queryable, and monitors targeting Logs sent from 2023-10-03 20:40 UTC onwards are valid. Queries targeting logs between 2023-10-02 11:40 UTC and 2023-10-03 20:40 UTC may return erroneous data. We are evaluating a fix that will restore query correctness for this time-window.
We have identified the underlying issue and are working on a fix.
We are continuing to investigate these issues, and will provide an update as soon as possible.
We are actively investigating issues with Log Queries returning unexpected results. As a result of this issue, some users may experience issues querying logs on the web application or API, and with Logs based Monitors.
Report: "Delayed Metrics and Logs"
Last update: This incident has been resolved.
We have identified the problem, and are backfilling logs and metrics data. Monitors are still disabled, and no data has been lost.
We are investigating increased latency processing Metrics and Logs. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Delayed Synthetic Browser Test Results"
Last update: We have scaled up the underlying system and we no longer observe latency in synthetic browser test results.
We have identified an issue that resulted in an increased latency executing Synthetics browser tests. As a result of this issue, some users may experience delays in receiving test results and notifications.
Report: "Monitors - Delayed Evaluations for Logs / APM / RUM Monitors"
Last update: This incident has been resolved.
Monitor Evaluations for Logs/APM/RUM are delayed.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved. Delays may have been observed for a subset of Distribution Metrics Monitor notifications between 00:00 and 00:53 UTC.
We are investigating delays in Distributions Monitors Notifications, which began at 00:00 UTC.
Report: "Delayed Synthetics tests results"
Last update: Backfill is finished. This incident has been resolved.
All services are fully operational and processing live data. We have started to backfill Synthetics tests results and will provide another update once the backfills are finished.
We have deployed a fix and we are monitoring the results. It is important to note that no data has been lost, and it will be backfilled and available once the service is operational again. We will provide another update once the issue is fully resolved.
We have identified an issue that resulted in an increased latency processing Synthetics tests results and are working on a fix. As a result of this issue, some users may see delays with test results and in notifications based on this test data.
Report: "Web application performance degraded"
Last update: This incident has been resolved.
We have identified the underlying issue and are working on a fix.
We are investigating loading issues on our web application. As a result, some users might be getting errors or degraded performance when loading the web application, specifically on dashboards.
Report: "Event monitors are not being evaluated"
Last update: This incident has been resolved.
We have mitigated the underlying problem. All event monitors are now evaluating properly again.
We are currently investigating an issue where some event monitors may not be evaluated.
Report: "Backfilling historical data for March 8, 2023 incident"
Last update: We have finished backfilling data across all products: all data received during the incident that had been successfully buffered but unprocessed is now fully accessible on the platform. Due to the nature of this outage, you may see some residual gaps in the data we received within the first few hours after the start of the incident. We truly appreciate your patience and understanding during this incident.
We have completed backfill of data for the following products: Database Monitoring and Network Device Monitoring. We are now in the process of validating and verifying data across all customers in those products. For other products, we are actively working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
We continue to work on backfilling data for additional products and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
We have completed backfill of data for the following products: APM traces and services, Logs, Network Performance Monitoring, Profiling, RUM, and CI Visibility. We are now in the process of validating and verifying data across all customers in those products. For other products, we are actively working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
All Datadog services are now available and able to receive, query, and report on live data. Monitors continue to be evaluated correctly since live data has been restored. Some customers may still observe gaps in historical data for parts of the last 24 hours. We are now working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
All Datadog services are now available and able to receive, query, and report on live data. Monitors continue to be evaluated correctly since live data has been restored. Some customers may still observe gaps in historical data for parts of the last 24 hours. We are now working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
CI Visibility and Watchdog are operational. Monitors continue to be evaluated correctly since live data has been restored. Unless noted otherwise, all Datadog services are now available and able to receive and query live data. Some customers may still observe gaps in historical data for certain products for parts of the last 24 hours. We are now working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
Unless noted otherwise, all Datadog services are now available and able to receive and query live data. Some customers may still observe gaps in historical data for certain products for parts of the last 24 hours. We are now working on backfilling data and will provide updates every 2 - 3 hours until the backfill effort is complete and the incident is fully resolved.
We are continuing to work on a fix for this issue.
Security Monitoring is operational. SLOs are operational. Profiling recent data is available for queries. We will continue to monitor progress towards recovering the remaining services.
Logs Management is operational, live data and alerting are back to normal. External Archives and Log Forwarding are still delayed. Serverless monitoring is operational. We will continue to monitor progress towards recovering the remaining services.
APM Traces is fully operational. RUM is fully operational. We will continue to monitor progress towards recovering the remaining services.
Metrics generated from Logs are now available. We will continue to monitor progress towards recovering the remaining services.
Network Performance Monitoring is fully operational. Event Management is fully operational. Error Tracking is seeing partial availability, and we're investigating. We will continue to monitor progress towards recovering the remaining services.
The Synthetics product is fully operational. Metrics from our cloud provider integrations are fully operational. We will continue to monitor progress towards recovering the remaining services.
Monitors for Logs and Service Checks are operational. Database Monitoring is operational. We will continue to monitor progress towards recovering the remaining services.
Live data is now available for Logs. We will continue to monitor progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
We are continuing to work on a fix for this issue.
Live Search on last 15 mins for APM Traces is recovered. We're seeing partial recovery for Synthetics. We will continue to monitor progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Error Tracking is now fully operational. We're seeing partial recovery for Network Performance Monitoring. These products may have gaps in data and partial limitations based on data available to monitors. We will continue to monitor progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
We're seeing partial recovery across several products including SLOs, Profiling, Watchdog, and Logs. These products may have gaps in data and partial limitations based on data available to monitors. We will continue to monitor progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Processes and their respective monitors, and Metrics are operational in EU1. Additionally, APM Metrics and associated monitors are operational. There may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Processes Monitors, APM Services and their respective monitors, and Metrics are operational in EU1. There may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
At 06:00 UTC on March 8th, 2023, the Datadog platform started experiencing widespread issues across multiple products and regions. The web application was unavailable or intermittently loading, and data ingestion & monitor evaluation were delayed. We will share a more detailed analysis post-recovery, but at a very high level: a system update on a number of hosts controlling our compute clusters caused a subset of these hosts to lose network connectivity. As a result, a number of the corresponding clusters entered unhealthy states and caused failures in a number of the internal services, datastores and applications hosted on these clusters. Our current status is: we identified and mitigated the initial issue, and rebuilt our clusters. We also have recovered a number of our applications and services, including our web portals. We are now working on recovering and catching up the rest of our data systems for metrics, traces and logs across the regions that are still affected (see region-specific status pages). The recovery work is currently constrained by the number and large scale of the systems involved. What to expect next: we are focusing on bringing back live data for all customers and all products before catching up on any historical data we may have stored during the outage. We expect live data recovery in a matter of hours (not minutes, and not days). We will continue to issue regular updates as the situation unfolds. We understand how critical Datadog is to your business; we sincerely apologize for the inconvenience and we are working hard to resolve this issue.
APM Services and their respective monitors are operational in EU1. Metrics are also operational in EU1, although there may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Metrics are now operational in EU1, although there may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Metrics are now operational in EU1, although there may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Metrics are now operational in EU1, although there may be gaps in historical metric data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Metrics are now operational in EU1. There may be gaps in historical metrics data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
Metrics are now operational in EU1. There may be gaps in historical metrics data. We continue progress towards recovering the remaining services. Data ingestion and monitor notifications remain delayed across non-metric data types.
We continue progress towards recovering all services. Data ingestion and monitor notifications remain delayed across all data types.
We are still progressing towards recovering all services. Data ingestion and monitor notifications remain delayed across all data types.
We are still progressing towards recovering all services. Data ingestion and monitor notifications remain delayed across all data types.
Some products are recovering and we are still progressing towards a complete recovery. Data ingestion and monitor notifications remain delayed across all data types.
Some products are recovering and we are still progressing towards a complete recovery. Data ingestion and monitor notifications remain delayed across all data types.
We are still working on the identified issue and are making continued progress towards recovery. Data ingestion and monitor notifications remain delayed across all data types.
We have identified the issue, and are making continued progress towards recovery. Data ingestion and monitor notifications remain delayed across all data types.
We are seeing reduced error rates for the web application. We are continuing to work on mitigating and investigating the issue causing delayed data ingestion across all data types. Monitor notifications are delayed, and you may observe delayed data throughout the app.
We are continuing to work on mitigating and recovering from the issue causing delayed data ingestion across all data types. Monitor notifications are delayed, and the web application continues to be unavailable.
We are continuing to work on mitigating and investigating the issue causing delayed data ingestion across all data types. Monitor notifications are delayed, and the web application continues to be unavailable.
We are continuing to investigate this issue.
We are still investigating issues causing delayed data ingestion across all data types. Monitor notifications may be delayed, and you may observe delayed data throughout the web app.
We are still investigating issues causing delayed data ingestion across all data types. Monitor notifications may be delayed, and you may observe delayed data throughout the web app.
We are investigating issues causing delayed data ingestion across all data types. As a result monitor notifications may be delayed, and you may observe delayed data throughout the web app.
We are investigating loading issues on our web application. As a result, some users might be getting errors when loading the web application.
Report: "GCP metrics delayed"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating an issue with our metrics collection from Google Cloud Platform. Metrics collected from the Google Cloud Platform may be delayed.
Report: "Delayed Metrics and Logs"
Last update: This incident has been resolved.
We are investigating increased latency processing Logs and Metrics. As a result of this issue, some users may see delays or gaps for data in their logs and metrics queries. To prevent spurious alerts, we may periodically disable monitors based on this data.
Report: "Degraded Web Application Performance"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating degraded performance with the web application.
Report: "Delayed Events"
Last update: This incident has been resolved. Remaining data are being processed.
We are continuing to monitor for any further issues. Backfilling is still in progress.
A fix has been implemented and we are monitoring the results. Recent data are being processed normally, older data impacted by the incident are currently being backfilled.
We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost, and it will be backfilled and available once the service is operational again.
Report: "Delays on metrics ingestion"
Last update: This incident has been resolved.
A fix has been implemented and we are recovering.
We are continuing to investigate this issue.
We are currently investigating delays in metrics ingestion for the following kinds of data: distribution metrics, Synthetics metrics, RUM metrics, and Logs pipelines and Sensitive Data Scanner usage.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate the issue. Notifications are back to normal for all users, except for the ones sent to Microsoft Teams.
We are investigating delays in Monitors Notifications affecting a subset of customers, which began at 07:10 UTC on January 25, 2023.
Report: "[SSO] Login Errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating user login issues with the web application via SSO, specifically an issue causing the "Login with SAML" button to not appear for some users. While we work on a fix, users may contact support@datadoghq.com to get the correct link to log in with SAML.
Report: "Monitors - Delayed Evaluation"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating delays in Monitors Evaluation, which began at 23:40 UTC.
Report: "Delayed Metrics"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating increased latency processing Metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Delayed Metrics"
Last update: This incident has been resolved.
As a result of this issue, some users may see delays or gaps in graphs that contain AWS metrics. We will provide another update once the issue is fully resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified an issue causing increased latency processing Metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled no-data alerts based on this data. This issue is causing increased latency processing Events. As a result of this issue, some users may see delays or gaps in the event stream or for event queries on dashboards.
We have identified the underlying issue and are working on a fix.
Report: "Elevated Error Rates for Metrics Queries"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost and it will be available once the service is fully operational.
We are actively investigating elevated error rates for Metrics Queries. As a result of this issue, some users may see errors with metrics graphs on the web application or API.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented. EU1 logs queries may still be slower than normal.
We are investigating delays in Logs Monitors Notifications that are affecting some customers, which began at 8:13 UTC.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating delays in Monitors Notifications, which began at 15:20 UTC.
Report: "Compliance Security Posture Management is partially unavailable"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating an issue preventing users from viewing findings aggregated by Rules or Resources, as well as the framework summary. Findings generation is not impacted.
Report: "Delayed Metrics"
Last update: This incident has been resolved.
We are investigating increased latency processing Metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Delayed Metrics"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating increased latency processing Metrics. As a result of this issue, some users may see delays or gaps for metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Incorrect Event Titles"
Last update: This incident has been resolved.
The issue has been corrected and new Events will show correct titles and data.
We are investigating an issue with incorrect Event titles. As a result of this issue, users may see delays or gaps in the event stream or for event queries on dashboards, as well as events with incorrect data. Corrected data will be backfilled after the incident is resolved. To prevent spurious alerts, we have temporarily disabled monitors based on this data.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have confirmed that metrics-based monitors are NOT impacted by this incident. Only events-based monitors are impacted.
We are continuing to work on a fix for this issue.
We are investigating delays in Monitors Notifications, which began at 14:00 UTC.