Historical record of incidents for Datadog AP1
Report: "Login Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are investigating user login issues related to reCAPTCHA for customers using password login. If you experience an issue with reCAPTCHA, refreshing the page can often mitigate the issue. Please note that data processing and alerts are not affected by this incident.
Report: "Login Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are investigating user login issues related to reCAPTCHA for customers using password login. If you experience an issue with reCAPTCHA, refreshing the page can often mitigate the issue. Please note that data processing and alerts are not affected by this incident.
Report: "Web Application Not Loading"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are fixing an issue which caused the web application to not load.
Report: "Degraded Web Application Performance"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are continuing to work on a fix. Degraded web application performance is primarily observed in customers with low network bandwidth.
We have identified the underlying issue and are working on a fix.
We are investigating degraded performance with the web application.
Report: "Metric Monitors are not being evaluated"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Metric monitors for some customers are not being evaluated, so those customers may see monitors fail to trigger. We are actively investigating and will provide updates as we make progress.
Report: "Web Application / Monitor Notification Issues"
Last update: This incident has been resolved.
We have identified the underlying issue and are working on a fix.
We are investigating loading issues on our web application. As a result, some users might be unable to load the web application. We are also investigating delays in monitor notifications, which began at 14:08 UTC.
Report: "Web UI features may be hidden"
Last update: This incident has been resolved. Please refresh your Datadog web page to resolve the issue completely.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating an issue that is causing certain features to be hidden from our UI. There is no data loss or monitoring impact.
Report: "Metric monitor evaluations and Log Archives degraded"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are experiencing issues with processing cloud integrations, which is resulting in delayed aws.* integration metrics and issues with Log Archives. We have disabled notifications relying on these metrics. We are investigating the issue and will provide additional information as it becomes available.
Report: "Delayed Events"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost, and it will be backfilled and available once the service is operational again.
We are investigating increased latency in processing Events. As a result of this issue, some users may see delays or gaps in the event stream or in event queries on dashboards.
Report: "CI Visibility - Page Load issue"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified an issue that prevents some Software Delivery pages from loading. Also, Intelligent Test Runner, Quality Gates, GitHub PR comments and Static Analysis uploads are affected. The team is working on a fix.
Report: "Delayed monitor notifications for distribution metrics"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We are investigating delays in notifications for monitors based on distribution metrics, which began at 3:50 PM UTC.
Report: "We are investigating user login issues with the web application"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating issues with users logging in to the web application by email. Please note that data processing and alerts are not affected by this incident.
Report: "NPM Analytics Page Outage"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Elevated Errors for API Key Validation"
Last update: From 12:45 to 1:15 PM US EST, Datadog’s endpoint for validating Datadog API keys was unavailable. During this window, Datadog Agents would be unable to validate their API key. In all cases, Agents would continue to send data. Some Agents running in Kubernetes may be marked unhealthy until restarted. Newly started Agents would fail to start. Build jobs using our CI Visibility product would be missing custom tags and measures.
Report: "Delayed Monitors Notifications"
Last update: This incident has been resolved.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.
We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost, and notifications will be caught up once the service is operational again.
We are investigating delays in Monitors Notifications, which began at 04:05 UTC.
Report: "Elevated Error Rates for Log Queries and Monitors"
Last update: This incident has been resolved.
A fix has been rolled out, and we are currently monitoring to confirm full resolution.
We have successfully tested a fix for this issue and are currently deploying it to resolve this incident.
We're still working on a fix for historical data impacted by this incident.
We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved. At this time, newly ingested data is properly queryable, and monitors targeting Logs sent from 2023-10-03 20:40 UTC onwards are valid. Queries targeting logs between 2023-10-02 11:40 UTC and 2023-10-03 20:40 UTC may return erroneous data. We are evaluating a fix that will restore query correctness for this time-window.
We have identified the underlying issue and are working on a fix.
We are continuing to investigate these issues, and will provide an update as soon as possible.
We are actively investigating issues with Log Queries returning unexpected results. As a result of this issue, some users may experience issues querying logs on the web application or API, and with Log-based Monitors and Log-based Metrics.
Report: "Delayed Synthetic Browser Test Results"
Last update: We have scaled up the underlying system and we no longer observe latency in Synthetics browser test results.
We have identified an issue that resulted in increased latency in executing Synthetics browser tests. As a result of this issue, some users may experience delays in receiving test results and notifications.
Report: "Monitors notifications lagging"
Last update: This incident has been resolved. Monitor evaluations between 15:25 UTC and 15:44 UTC were lost. Monitors that transitioned during this time window and did not transition back have triggered an alert.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Web application not loading"
Last update: This incident has been resolved.
We have identified the underlying issue and are working on a fix.
We are investigating loading issues on our web application. As a result, some users might be getting errors when loading the web application.