Is Datadog US1 Down Right Now? Discover if there is an ongoing service outage.

Datadog US1 is currently Operational

Last checked Jul 29, 2025 23:18 UTC from Datadog US1's official status page

Historical record of incidents for Datadog US1

Jul 18, 2025

Report: "Google SSO login errors"

Last update 2025-07-18T15:30:47.408Z

investigating2025-07-18T15:30:46.953Z

We are investigating user login issues with the web application via Google SSO. Please note that data processing and alerts are not affected by this incident.

Jul 9, 2025

Report: "Degraded Web Application Performance & Monitor Evaluations"

Last update 2025-07-09T20:46:32.419Z

investigating2025-07-09T20:46:31.984Z

We're investigating an issue with our metrics and monitor evaluations, causing degraded web application performance and skipped monitors.

Jul 7, 2025

Report: "Monitors - Delayed Evaluation of logs monitors"

Last update 2025-07-07T14:39:09.772Z

investigating2025-07-07T14:39:09.336Z

We are investigating delays in the evaluation of logs-based monitors, which began at 01:30:00 PM UTC.

Jun 24, 2025

Report: "Logs Monitors - Delayed Evaluations"

Last update 2025-06-24T21:16:52.039Z

investigating2025-06-24T21:16:51.553Z

We are investigating delays in Logs Monitors Evaluations, which began at 8:46 PM UTC.

Jun 9, 2025

Report: "Delayed processing of APM Trace Metrics"

Last update 2025-06-09T22:47:26.222Z

investigating2025-06-09T22:47:25.646Z

We are investigating delayed processing of APM Trace metrics starting around 21:40 UTC. Dashboards and monitors relying on these metrics are affected.

Jun 5, 2025

Report: "Elevated error rates in queries across multiple products"

Last update 2025-06-05T15:08:21.128Z

investigating2025-06-05T15:08:20.510Z

We are actively investigating issues querying data affecting multiple products. As a result of this issue, there might be errors when trying to load data from queries on different pages of the web application or through the API.

May 22, 2025

Report: "Monitors - Delayed Evaluation"

Last update 2025-05-22T19:09:07.342Z

resolved2025-05-22T19:09:06.851Z

This incident has been resolved.

identified2025-05-22T18:10:08.487Z

The issue has been identified and a fix is being implemented.

investigating2025-05-22T17:46:29.000Z

We are investigating delays in Distribution Monitors Evaluation, which began at 5:30pm UTC. Monitors for other types of metrics are evaluating as usual.

May 13, 2025

Report: "Delayed Traces and Spans in APM"

Last update 2025-05-13T22:28:28.959Z

resolved2025-05-13T22:28:28.493Z

The incident is now resolved. APM trace ingestion and all downstream systems, including monitors, have fully recovered and are up to date.

monitoring2025-05-13T20:25:39.185Z

We are monitoring a fix for increased latency processing in APM Metrics. APM data in live view is current, but distributed tracing metrics are delayed by 20 minutes. Monitors sourced from the data are impacted until the data becomes current.

investigating2025-05-13T19:33:01.000Z

As a result of the issue, we are monitoring delays in Monitors Evaluation.

monitoring2025-05-13T19:20:17.896Z

A fix has been implemented and we are monitoring the results.

investigating2025-05-13T19:06:02.000Z

We are investigating increased latency processing Traces and Spans in APM. As a result of this issue, some users may see missing or delayed Traces and Spans starting at 18:33 UTC.

Report: "Delayed Traces and Spans in APM"

Last update 2025-05-13T19:06:00.000Z

Investigating2025-05-13T19:06:00.000Z

We are investigating increased latency processing Traces and Spans in APM. As a result of this issue, some users may see missing or delayed Traces and Spans starting at 18:45 UTC.

May 2, 2025

Report: "Delayed AWS Metrics and Events"

Last update 2025-05-02T02:13:02.724Z

resolved2025-05-02T02:13:02.351Z

This incident has been resolved.

identified2025-05-02T01:52:19.106Z

A fix has been implemented and recovery is in progress. To prevent spurious alerts, monitors on AWS Metrics and Events remain disabled until recovery is complete.

identified2025-05-02T01:19:32.667Z

The issue has been identified and a fix is being implemented.

investigating2025-05-02T00:56:58.231Z

We are investigating increased latency processing AWS metrics and events. As a result of this issue, some users may see delays or gaps in graphs that contain these metrics and events. To prevent spurious alerts, we have temporarily disabled monitors based on this data.

Apr 16, 2025

Report: "Monitors - Delayed Evaluation"

Last update 2025-04-16T13:52:55.968Z

resolved2025-04-16T13:52:55.646Z

This incident has been resolved.

investigating2025-04-16T13:50:58.801Z

The incident has fully recovered. The service is now fully operational.

investigating2025-04-16T13:37:46.177Z

We are investigating delays in Monitors Evaluation, which began at 12:45 UTC.

Mar 26, 2025

Report: "Delayed processing of APM Trace Metrics"

Last update 2025-03-26T20:43:21.530Z

resolved2025-03-26T20:24:56.172Z

This incident has been resolved.

monitoring2025-03-26T20:16:20.199Z

We are continuing to monitor for any further issues.

monitoring2025-03-26T20:16:12.425Z

A fix has been implemented and we are monitoring the results.

identified2025-03-26T20:09:11.727Z

The issue has been identified and a fix is being implemented.

investigating2025-03-26T20:03:21.000Z

We are investigating delayed processing of APM Trace metrics starting around 07:00 UTC. Dashboards and monitors relying on these metrics are affected.

Report: "Login Issues"

Last update 2025-03-26T00:17:38.120Z

resolved2025-03-26T00:17:35.788Z

This incident has been resolved.

identified2025-03-25T21:34:49.958Z

We are continuing to work on a fix for this issue.

identified2025-03-25T19:24:26.918Z

The issue has been identified and a fix is being implemented.

investigating2025-03-25T19:03:40.098Z

We are investigating user login issues related to reCAPTCHA for customers using password login. If you experience an issue with reCAPTCHA, refreshing the page can often mitigate the issue. Please note that data processing and alerts are not affected by this incident.

Feb 23, 2025

Report: "Delayed Processing for a Subset of Metrics"

Last update 2025-02-23T15:32:17.694Z

resolved2025-02-23T15:32:17.343Z

This incident has been resolved.

monitoring2025-02-23T14:17:42.573Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2025-02-23T13:06:08.067Z

We have identified the underlying issue and continue to work on a fix. It is important to note that no data has been lost: data is being backfilled and will be available once the service is operational again.

identified2025-02-23T11:24:10.618Z

identified2025-02-23T10:33:43.676Z

We have identified the underlying issue and continue to work on a fix. It is important to note that no data has been lost, and it will be backfilled and available once the service is operational again.

identified2025-02-23T10:02:20.543Z

We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost, and it will be backfilled and available once the service is operational again.

investigating2025-02-23T09:30:54.143Z

We are investigating increased latency processing Trace Metrics. As a result of this issue, some users may see delays or gaps for a subset of their metrics on graphs and statistics on Service Catalog.

Jan 31, 2025

Report: "Degraded Web Application Performance"

Last update 2025-01-31T19:03:21.705Z

resolved2025-01-31T19:03:21.350Z

This incident has been resolved.

monitoring2025-01-31T18:22:01.054Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2025-01-31T16:49:33.887Z

We have identified the underlying issue and are continuing to work on a fix. Degraded web application performance is primarily observed in customers with low network bandwidth.

identified2025-01-31T16:29:15.259Z

We have identified the underlying issue and are working on a fix.

investigating2025-01-31T16:03:52.620Z

We are investigating degraded performance with the web application.

Jan 17, 2025

Report: "Increased delay processing events"

Last update 2025-01-17T16:50:41.801Z

resolved2025-01-17T16:50:41.402Z

This incident has been resolved.

monitoring2025-01-17T15:34:28.477Z

We are continuing to monitor the progress of processing the backlog in Events. The majority of the backlog has been processed. Event Monitor evaluation remains delayed while we finish processing the backlog.

identified2025-01-17T14:31:18.621Z

We've implemented a fix, and are currently working through the backlog of delayed Events. Event Monitor evaluation remains delayed while we work through the backlog. All other monitor types have recovered and are currently evaluating.

identified2025-01-17T13:55:35.781Z

We have identified the issue causing delayed ingestion of Events. Alerting evaluation continues to be delayed for Event Monitors, Process Monitors, and Cloud Network monitors. All other monitor types have recovered and are currently evaluating.

investigating2025-01-17T13:46:47.724Z

We are continuing to investigate this issue.

investigating2025-01-17T13:42:49.913Z

We are investigating increased latency processing Events. As a result of this issue, some users may see delays in the event stream or for event queries on dashboards, and event alert evaluation is delayed. This issue also caused a delay in the processing of alerts across other products. We've implemented a fix for this, and are monitoring the recovery of the alert evaluation pipeline. As a result, a subset of alerts may be delayed while the system recovers.

Jan 3, 2025

Report: "APM connections retrying"

Last update 2025-01-03T05:34:17.816Z

resolved2025-01-03T05:34:17.536Z

This incident has been resolved.

monitoring2025-01-03T05:28:18.557Z

We have mitigated the cause of transient agent submission errors for APM and customers should no longer observe these errors. The Datadog Agent automatically retried these errors and succeeded on retry; this incident did not result in any data loss.

identified2025-01-03T05:15:51.055Z

The issue has been identified and a fix is being implemented.

investigating2025-01-03T05:13:14.179Z

Some US1 customers are experiencing degraded performance for APM. Customers may see transient errors, but these should resolve with an automatic retry from the Datadog Agent.

Dec 4, 2024

Report: "Delayed APM Distribution Metrics, Data Streams Monitoring Metrics & Monitor Notifications"

Last update 2024-12-04T21:50:08.711Z

resolved2024-12-04T21:50:08.418Z

This incident has been resolved.

monitoring2024-12-04T19:43:39.920Z

A fix has been implemented and we are monitoring the results.

identified2024-12-04T19:05:33.260Z

Data Streams Monitoring metrics and associated monitor notifications based on these metrics have recovered.

identified2024-12-04T18:53:28.096Z

We are continuing to work on a fix for this issue.

identified2024-12-04T18:49:09.079Z

The issue has been identified and a fix is being implemented.

investigating2024-12-04T18:48:43.000Z

We are investigating increased latency in processing APM Distribution Metrics and Data Streams Monitoring Metrics, as well as monitor notifications based on these metrics, which began at 17:47 UTC. As a result of this issue, some users may see delays or gaps for these metrics on graphs, including APM pages, as well as delayed monitor notifications.

Nov 26, 2024

Report: "Delayed APM data ingestion"

Last update 2024-11-26T20:54:34.982Z

resolved2024-11-26T20:54:33.837Z

This incident has been resolved.

monitoring2024-11-26T19:45:25.611Z

A fix has been implemented and systems are recovering.

investigating2024-11-26T19:12:01.029Z

We are investigating increased ingestion latency of APM data.

Nov 20, 2024

Report: "Monitors - Delayed Evaluation for Distribution Metric Monitors"

Last update 2024-11-20T18:09:28.828Z

resolved2024-11-20T18:09:28.566Z

This incident has been resolved.

monitoring2024-11-20T18:02:10.642Z

We have rolled out a fix and all distribution monitors are up to date. We are continuing to monitor the customer experience and expect to resolve this incident in the next 30 minutes.

identified2024-11-20T17:42:29.043Z

We are in the process of rolling out a fix that will bring all distribution monitors up to date. We will update again when the issue is resolved.

identified2024-11-20T17:07:02.670Z

The root cause has been identified. We are working on a fix so that distribution metric monitor evaluations are up to date.

investigating2024-11-20T16:46:58.901Z

We are investigating delays in monitor evaluations for monitors based on distribution metrics, starting at 15:35 UTC. This is causing a delay in notifications.

investigating2024-11-20T16:16:44.117Z

We are investigating delays in Distribution Metric Monitors Evaluation, which began at 15:35 UTC.

Report: "Monitors - Delayed Evaluation"

Last update 2024-11-20T17:06:29.737Z

resolved2024-11-20T17:06:29.470Z

This incident has been resolved.

monitoring2024-11-20T16:53:20.666Z

A fix has been implemented and we are monitoring the results.

investigating2024-11-20T16:36:55.734Z

We are investigating delays in Events-based Monitor Evaluation, which began at 16:00 UTC.

Nov 15, 2024

Report: "Delayed Distribution Metrics"

Last update 2024-11-15T14:51:42.977Z

resolved2024-11-15T14:51:40.611Z

This incident has been resolved. All distribution metrics are being processed and monitors are no longer disabled for distribution metrics.

monitoring2024-11-15T14:05:30.556Z

A fix has been implemented and we are monitoring the results.

investigating2024-11-15T13:33:48.440Z

We are continuing to investigate this issue.

investigating2024-11-15T13:33:24.073Z

We are investigating increased latency processing Distribution Metrics. As a result, some users may see delays or gaps for distribution metrics on graphs, including APM pages. Monitors based on this data may also be delayed. We have identified the problem and are actively working to resolve the issue.

Oct 17, 2024

Report: "Delayed distribution metrics & monitor notifications"

Last update 2024-10-17T19:28:05.144Z

resolved2024-10-17T19:28:04.658Z

This incident has been resolved.

monitoring2024-10-17T19:03:23.791Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2024-10-17T18:36:20.281Z

We have identified the underlying issue and are working on a fix.

investigating2024-10-17T18:16:09.935Z

We are investigating delays in distribution metrics and in monitor notifications for monitors based on these metrics, which began at 17:40 UTC.

Oct 11, 2024

Report: "Delayed Distribution Metrics"

Last update 2024-10-11T22:21:39.898Z

resolved2024-10-11T22:20:24.000Z

This incident has been resolved. All distribution metrics are being processed and monitors are no longer disabled for distribution metrics.

monitoring2024-10-11T21:13:12.724Z

A fix has been implemented and we are monitoring the results.

identified2024-10-11T20:35:12.771Z

We are continuing to work on a fix for this issue.

identified2024-10-11T19:57:27.037Z

The issue has been identified and remediation steps are underway.

investigating2024-10-11T19:52:05.368Z

We are investigating increased latency processing Distribution Metrics. As a result of this issue, some users may see delays or gaps for distribution metrics on graphs. To prevent spurious alerts, we have temporarily disabled monitors based on distribution metrics.

Oct 4, 2024

Report: "[SSO] Login Errors"

Last update 2024-10-04T01:10:05.653Z

resolved2024-10-04T01:10:05.267Z

This incident has been resolved. If you continue to see issues, please contact Datadog technical support.

monitoring2024-10-04T00:26:42.519Z

A fix has been implemented and we are monitoring the results.

identified2024-10-04T00:04:25.995Z

We are continuing to work on a fix for this issue.

identified2024-10-03T23:01:04.560Z

We have identified the issue and are implementing a fix.

identified2024-10-03T21:31:35.965Z

We are investigating user login issues with the web application when using Okta SSO.

Sep 11, 2024

Report: "Delayed Monitors Notifications"

Last update 2024-09-11T22:28:15.325Z

resolved2024-09-11T22:28:13.734Z

This incident has been resolved.

monitoring2024-09-11T22:12:58.762Z

A fix has been implemented and we are monitoring the results.

identified2024-09-11T22:12:37.900Z

We are continuing to work on a fix for this issue.

identified2024-09-11T21:51:33.036Z

We are continuing to work on a fix for this issue.

identified2024-09-11T20:59:36.359Z

The issue has been identified and a fix is being implemented.

investigating2024-09-11T20:59:26.035Z

We are investigating delays in Monitors Notifications for distribution metrics, which began at 20:00 UTC.

Report: "Degraded Web Application Performance"

Last update 2024-09-11T19:57:27.358Z

resolved2024-09-11T19:57:26.943Z

This incident has been resolved.

identified2024-09-11T19:46:50.500Z

The issue has been identified and a fix is being implemented.

investigating2024-09-11T19:28:29.459Z

We are continuing to investigate this issue.

investigating2024-09-11T19:17:09.747Z

We are investigating degraded performance with the web application related to metrics-based widgets.

Aug 30, 2024

Report: "Web UI features may be hidden"

Last update 2024-08-30T15:21:07.764Z

resolved2024-08-30T15:21:07.321Z

This incident has been resolved. Please refresh your Datadog web page to resolve the issue completely.

monitoring2024-08-30T15:14:48.825Z

A fix has been implemented and we are monitoring the results.

identified2024-08-30T15:09:34.101Z

The issue has been identified and a fix is being implemented.

investigating2024-08-30T14:55:44.948Z

We are currently investigating an issue that is causing certain features to be hidden from our UI. There is no data loss or monitoring impact.

Aug 29, 2024

Report: "Delayed Monitors Notifications"

Last update 2024-08-29T10:41:32.162Z

resolved2024-08-29T10:41:31.681Z

This incident has been resolved.

monitoring2024-08-29T10:32:32.486Z

A fix has been implemented and we are monitoring the results.

investigating2024-08-29T10:22:32.440Z

We are investigating delays in Monitors Notifications, which began at 06:05 ET.

Aug 28, 2024

Report: "Delayed Metrics Monitor Evaluations"

Last update 2024-08-28T22:40:48.004Z

resolved2024-08-28T22:40:47.628Z

This incident has been resolved.

investigating2024-08-28T22:13:15.010Z

Monitors with long intervals may still be delayed but the service is recovered.

investigating2024-08-28T21:38:21.544Z

We have identified the issue and deployed a fix; we are monitoring the recovery.

investigating2024-08-28T21:00:02.864Z

We are investigating increased delays in metrics-based monitors for some customers.

Aug 14, 2024

Report: "Delayed Monitors Evaluations"

Last update 2024-08-14T19:31:39.846Z

resolved2024-08-14T19:31:39.446Z

This incident has been resolved.

monitoring2024-08-14T19:24:47.775Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

investigating2024-08-14T18:53:03.088Z

We are investigating delayed evaluation of a subset of metric monitors. Customers may experience delayed or missing monitor notifications as a result.

Aug 8, 2024

Report: "APM - degraded performance"

Last update 2024-08-08T16:13:54.914Z

resolved2024-08-08T16:13:54.500Z

This incident has been resolved.

monitoring2024-08-08T16:04:21.209Z

A fix has been implemented and we are monitoring the results.

investigating2024-08-08T15:47:17.737Z

We are investigating an issue in executing trace queries; the team is working on a fix.

Report: "CI Visibility - Page Load issue"

Last update 2024-08-08T16:13:38.100Z

resolved2024-08-08T16:13:37.573Z

This incident has been resolved.

monitoring2024-08-08T16:04:40.962Z

A fix has been implemented and we are monitoring the results.

investigating2024-08-08T15:45:10.048Z

We have identified an issue that prevents most Software Delivery pages from loading. Intelligent Test Runner, Quality Gates, and GitHub PR comments are also affected.

Report: "Application Security Management - Issue Updating Configurations"

Last update 2024-08-08T16:13:21.169Z

resolved2024-08-08T16:13:20.752Z

This incident has been resolved.

monitoring2024-08-08T16:05:11.860Z

A fix has been implemented and we are monitoring the results.

investigating2024-08-08T15:44:13.028Z

We are investigating an issue in updating configurations in the product; the team is working on a fix.

Report: "Partial outage on components of RUM product"

Last update 2024-08-08T16:13:01.199Z

resolved2024-08-08T16:13:00.769Z

This incident has been resolved.

monitoring2024-08-08T16:05:28.170Z

A fix has been implemented and we are monitoring the results.

investigating2024-08-08T15:41:43.152Z

We have identified an issue that affects the use of Sankey and Cohorts Analysis in the RUM product; the team is working on a fix.

Jul 24, 2024

Report: "Delayed Monitors Notifications"

Last update 2024-07-24T14:32:08.020Z

resolved2024-07-24T14:32:07.716Z

This incident has been resolved.

identified2024-07-24T14:17:44.079Z

We identified a delay in Monitor Notifications between 13:52 UTC and 14:05 UTC. The issue has been resolved, but we continue to monitor the situation.

investigating2024-07-24T14:16:10.139Z

We identified a delay in Monitor Notifications between 13:52 UTC and 14:05 UTC. The issue has been resolved, but we continue to monitor the situation.

Jul 22, 2024

Report: "Delayed AWS, GCP, Azure, and SaaS Integration Metrics"

Last update 2024-07-22T21:46:49.201Z

resolved2024-07-22T21:46:48.827Z

This incident has been resolved.

monitoring2024-07-22T21:39:29.338Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

investigating2024-07-22T21:17:39.035Z

We are investigating increased latency processing some AWS, GCP, Azure and SaaS Integration Metrics. As a result of this issue, some users may see delays or gaps in graphs that contain these metrics. To prevent spurious alerts, we have temporarily disabled monitors based on this data.

Jul 5, 2024

Report: "Delayed Monitors Notifications"

Last update 2024-07-05T18:21:00.249Z

resolved2024-07-05T18:20:59.918Z

This incident has been resolved.

monitoring2024-07-05T17:14:38.725Z

We are finalizing our recovery and at this time expect customers should see no further impact. We will continue to monitor for issues.

identified2024-07-05T16:57:20.905Z

We are seeing continuing improvements and are recovering as quickly as possible while maintaining system stability. Distribution metrics remain delayed and associated monitor evaluations are currently skipped. Point metrics and associated monitors are fully recovered.

identified2024-07-05T16:11:27.000Z

We are seeing continuing improvements. Distribution metrics remain delayed and associated monitor evaluations are currently skipped.

identified2024-07-05T16:10:22.128Z

We are seeing continuing improvements. Distribution metrics remain delayed and associated monitor evaluations are currently skipped.

investigating2024-07-05T15:46:54.757Z

We are seeing improvements in metrics processing. Distribution metrics remain delayed and associated monitor evaluations are currently skipped.

investigating2024-07-05T15:23:41.597Z

We are investigating issues in metrics processing that are impacting monitor evaluations, dashboards, and other products.

investigating2024-07-05T15:11:58.049Z

We are continuing to investigate this issue.

investigating2024-07-05T15:00:34.300Z

We are investigating delays in Monitors Notifications, which began at 14:40 UTC.

Jul 3, 2024

Report: "We are investigating user login issues with the web application"

Last update 2024-07-03T16:13:22.648Z

resolved2024-07-03T16:13:22.247Z

This incident has been resolved.

monitoring2024-07-03T16:02:28.449Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2024-07-03T15:25:42.493Z

We have identified the underlying issue and are working on a fix.

investigating2024-07-03T14:34:35.775Z

We are investigating user login issues with the web application when logging in by email. Please note that data processing and alerts are not affected by this incident.

May 13, 2024

Report: "Metrics historical data failed queries"

Last update 2024-05-13T21:55:12.431Z

resolved2024-05-13T21:55:11.899Z

Our metrics system has recovered and all historical metrics are now queryable.

monitoring2024-05-13T21:18:14.174Z

The system continues to recover. Data is available, but some results may be slow or incomplete until full recovery. Our teams continue to monitor the incident.

monitoring2024-05-13T20:17:10.148Z

A fix has been implemented. While the system continues to recover, data will be available but some results may be slow or incomplete until full recovery is complete.

identified2024-05-13T19:40:05.899Z

We are continuing to work on a fix for this issue.

identified2024-05-13T18:41:26.639Z

We are continuing to work on a fix for this issue.

identified2024-05-13T18:14:12.744Z

The issue has been identified and a fix is being implemented.

investigating2024-05-13T17:59:44.802Z

We are investigating queries failing for historical data for metrics, impacting timeframes more than one day ago. Queries for recent data are not affected by this incident.

May 2, 2024

Report: "Partial outage of metrics query"

Last update 2024-05-02T20:55:24.383Z

resolved2024-05-02T20:55:23.788Z

This incident has been resolved.

monitoring2024-05-02T20:21:10.248Z

A fix has been implemented and we are monitoring the results.

identified2024-05-02T20:04:46.070Z

The issue has been identified and a fix is being implemented.

investigating2024-05-02T20:00:42.485Z

We are currently investigating this issue.

Apr 11, 2024

Report: "Elevated Error Rates for Metrics Submission"

Last update 2024-04-11T21:10:04.593Z

resolved2024-04-11T21:10:04.064Z

This incident has been resolved.

identified2024-04-11T20:27:39.295Z

The issue has been identified and a fix is being implemented.

investigating2024-04-11T19:56:42.541Z

We are investigating elevated error rates for Metrics Submission APIs. As a result of this issue, submitting new metric data through the API might fail temporarily. Please note that the Datadog Agent and Client Libraries will buffer data or retry to avoid data loss.

Mar 6, 2024

Report: "Elevated Errors for API Key Validation"

Last update 2024-03-06T19:45:45.602Z

resolved2024-03-06T17:00:00.000Z

From 12:45 to 1:15 PM US EST, Datadog’s endpoint to validate Datadog API keys was unavailable. During this window, Datadog Agents would be unable to validate their API key. In all cases, Agents would continue to send data. Some Agents running in Kubernetes may be marked unhealthy until restarted. Newly started Agents would fail to start. Build jobs using our CI Visibility product would be missing custom tags and measures.

Mar 5, 2024

Report: "Google SSO login issues for web application"

Last update 2024-03-05T17:37:19.648Z

resolved2024-03-05T17:37:18.837Z

This incident has been resolved.

identified2024-03-05T17:14:56.670Z

We are continuing to monitor progress. We will post further updates when we have them.

identified2024-03-05T16:43:50.388Z

We are seeing signs of recovery and are continuing to monitor progress. We will post further updates when we have them.

investigating2024-03-05T16:27:43.256Z

We are investigating user login issues with the web application via Google SSO. Users switching orgs might also be affected. Please note that data processing and alerts are not affected by this incident.

Jan 31, 2024

Report: "Elevated Error Rates for Metrics Submission"

Last update 2024-01-31T08:14:25.526Z

resolved2024-01-31T08:14:25.509Z

This incident has been resolved.

monitoring2024-01-31T07:59:10.772Z

A fix has been implemented and we are monitoring recovery. Metric monitor evaluations still might be delayed; we will post an update when this recovers.

identified2024-01-31T07:40:19.756Z

We are still investigating elevated error rates for Metrics Submission APIs and delays processing metrics monitors.

investigating2024-01-31T07:18:10.648Z

Jan 22, 2024

Report: "Logs Status Elevated Error Rates"

Last update 2024-01-22T22:01:56.598Z

resolved2024-01-22T21:31:34.000Z

This incident has been resolved. As a result of this incident, logs from AWS Lambda (specifically, those tagged with source:lambda) were incorrectly categorized as errors from 18:30 UTC to 20:50 UTC on 2024-01-22. All logs after this date are being processed as normal.

monitoring2024-01-22T21:24:06.473Z

A fix has been implemented and we are monitoring the results.

identified2024-01-22T20:31:20.260Z

We have identified the underlying issues for elevated error rates for Log Management. As a result of this issue, some users may see incorrect statuses for logs from AWS Lambda.

Report: "Delayed Infrastructure Updates"

Last update 2024-01-22T20:38:02.853Z

resolved2024-01-22T20:38:02.829Z

This incident has been resolved.

monitoring2024-01-22T19:46:50.507Z

We have deployed a fix and we are monitoring the results. We will provide another update in 30 minutes once the service is fully operational.

investigating2024-01-22T19:04:17.993Z

We are investigating increased latency processing host updates. As a result of this issue, some users may see delays in host activity status updates on the infrastructure list.

Jan 4, 2024

Report: "Elevated error rate for metrics and delayed metric monitors"

Last update 2024-01-04T17:20:01.906Z

resolved2024-01-04T17:19:59.898Z

This incident has been resolved.

monitoring2024-01-04T16:56:44.682Z

A fix has been implemented and we are monitoring the results.

identified2024-01-04T16:49:20.261Z

The issue has been identified and a fix is being implemented.

investigating2024-01-04T16:37:40.700Z

We are actively investigating elevated errors and slow queries for metrics data. As a result of this issue, some users may see errors when trying to load data on dashboards and metrics monitors evaluation may be delayed.

Dec 23, 2023

Report: "Delayed Monitors Notifications"

Last update 2023-12-23T00:57:31.411Z

resolved2023-12-23T00:57:30.767Z

This incident has been resolved.

monitoring2023-12-23T00:42:17.486Z

A fix has been implemented and we are monitoring the results.

identified2023-12-23T00:40:43.337Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2023-12-23T00:34:17.814Z

We are continuing to work on a fix for this issue.

identified2023-12-23T00:23:16.472Z

We are continuing to work on a fix for this issue.

identified2023-12-23T00:13:30.710Z

We have identified the underlying issue and are working on a fix.

investigating2023-12-23T00:08:17.678Z

We are investigating delays in the processing of Metrics and corresponding Monitor Notifications, which began at 22:30 UTC.

Oct 27, 2023

Report: "Delayed Monitors Notifications"

Last update 2023-10-27T17:14:01.244Z

resolved2023-10-27T17:14:00.564Z

This incident has been resolved.

monitoring2023-10-27T17:06:01.477Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

investigating2023-10-27T16:36:01.233Z

We are continuing to investigate the issue.

investigating2023-10-27T16:14:04.134Z

We are investigating delays in Monitors Notifications, which began at 15:16 UTC.

Oct 24, 2023

Report: "Degraded Web Application Performance"

Last update 2023-10-24T18:12:59.069Z

resolved2023-10-24T17:38:42.667Z

This incident has been resolved.

monitoring2023-10-24T17:32:41.702Z

A fix has been implemented and we are monitoring the results.

identified2023-10-24T17:17:26.016Z

The issue has been identified and we are working on a fix. Note that this only affects the Monitors, SLOs, and Incident Management web APIs.

investigating2023-10-24T17:11:34.253Z

We are investigating degraded performance with the web application.

Oct 4, 2023

Report: "Elevated Error Rates for Log Queries and Monitors"

Last update 2023-10-04T15:44:56.433Z

resolved2023-10-04T15:44:56.419Z

This incident has been resolved.

monitoring2023-10-04T15:44:28.152Z

Fix rollout has now been completed.

monitoring2023-10-04T14:50:58.801Z

The fix rollout is currently ongoing. Once completed we will confirm resolution.

monitoring2023-10-04T13:54:43.808Z

The fix rollout is currently ongoing. Once completed we will confirm resolution.

monitoring2023-10-04T13:04:39.988Z

We have successfully tested a fix for this issue and are currently deploying it to resolve this incident.

monitoring2023-10-04T12:18:40.728Z

We're still working on a fix for historical data impacted by this incident.

monitoring2023-10-04T11:41:47.582Z

We're still working on a fix for historical data impacted by this incident.

monitoring2023-10-04T11:07:01.752Z

We're still working on a fix for historical data impacted by this incident.

monitoring2023-10-04T10:26:46.091Z

We're still working on a fix for historical data impacted by this incident.

monitoring2023-10-04T09:19:11.270Z

We're still working on a fix for historical data impacted by this incident.

monitoring2023-10-03T20:47:44.921Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved. At this time, newly ingested data is properly queryable, and monitors targeting logs sent from 2023-10-03 20:40 UTC onwards are valid. Queries targeting logs between 2023-10-02 11:40 UTC and 2023-10-03 20:40 UTC may return erroneous data. We are evaluating a fix that will restore query correctness for this time window.

identified2023-10-03T19:32:33.703Z

We have identified the underlying issue and are working on a fix.

investigating2023-10-03T18:49:55.643Z

We are continuing to investigate these issues, and will provide an update as soon as possible.

investigating2023-10-03T17:30:48.000Z

We are actively investigating issues with Log Queries returning unexpected results. As a result of this issue, some users may experience issues querying logs on the web application or API, and with Log-Based Monitors and Log-Based Metrics.

Sep 26, 2023

Report: "Delayed Metric Monitor Notifications"

Last update 2023-09-26T04:25:18.322Z

resolved2023-09-26T04:25:18.308Z

This incident has been resolved.

identified2023-09-26T03:55:45.493Z

We have identified the underlying issue and are working on a fix. It is important to note that no data has been lost, and notifications will be caught up once the service is operational again.

investigating2023-09-26T02:56:30.248Z

We are investigating delays in Metrics Monitors Notifications, which began at 02:35 UTC.

Sep 25, 2023

Report: "Monitors Notifications Delayed"

Last update 2023-09-25T21:55:20.080Z

resolved2023-09-25T21:55:20.067Z

This incident has been resolved.

monitoring2023-09-25T21:48:33.113Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2023-09-25T21:14:48.756Z

We are aware of delays in Metric Monitors Notifications, which began at 20:55 UTC. We have identified the underlying issue and are working on a fix.

Sep 22, 2023

Report: "Delayed Processing for a Subset of Metrics"

Last update 2023-09-22T16:07:31.741Z

resolved2023-09-22T16:07:31.725Z

This incident has been resolved.

investigating2023-09-22T15:50:16.026Z

We are continuing to investigate the issue. To prevent spurious alerts, we have temporarily disabled affected monitors based on this data.

investigating2023-09-22T15:48:22.282Z

We are investigating increased processing latency for a subset of metrics. As a result of this issue, some users may see delays or gaps for a subset of their metrics on graphs.

Sep 12, 2023

Report: "Delayed monitor notifications & metrics graphing issues"

Last update 2023-09-12T02:13:31.940Z

resolved2023-09-12T02:13:31.924Z

This incident has been resolved.

monitoring2023-09-12T01:16:53.117Z

Users may still be experiencing some issues with graphs not loading in the web application. We will provide another update once the issue is fully resolved.

monitoring2023-09-12T00:07:45.146Z

Users may still be experiencing some issues with graphs not loading in the web application. We will provide another update once the issue is fully resolved.

monitoring2023-09-11T23:01:26.047Z

Users may still be experiencing some issues with graphs not loading in the web application. We will provide another update once the issue is fully resolved.

monitoring2023-09-11T21:59:21.018Z

Issues with monitor notifications have been resolved. Users may still be experiencing some issues with graphs not loading in the web application. We will provide another update once the issue is fully resolved.

monitoring2023-09-11T21:30:10.883Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

investigating2023-09-11T20:42:25.584Z

We are continuing to investigate the issue.

investigating2023-09-11T19:36:59.643Z

We are continuing to investigate the issue.

investigating2023-09-11T18:47:06.052Z

We are investigating delays in Monitors Notifications for monitors which rely on distribution metrics, which began at 17:58 UTC. Users may also experience some issues with graphs not loading in the web application. Please note that data ingest is not affected by this incident.

Sep 1, 2023

Report: "Delayed Monitors Notifications"

Last update 2023-09-01T15:31:20.583Z

resolved2023-09-01T15:31:20.568Z

This incident has been resolved.

monitoring2023-09-01T15:18:01.484Z

We have deployed a fix and we are monitoring the results. We will provide another update once the issue is fully resolved.

identified2023-09-01T15:00:28.111Z

We have identified the underlying issue and are working on a fix.

investigating2023-09-01T14:55:14.787Z

We are investigating delays in Monitors Notifications, which began at 13:53 UTC.