Daily

Is Daily Down Right Now? Discover if there is an ongoing service outage.

Daily is currently Operational

Last checked from Daily's official status page

Historical record of incidents for Daily

Report: "Issues with logging and telemetry"

Last update
investigating

Daily is being affected by widespread internet outages in authentication services. This is currently affecting our ability to collect or view some call metrics. We'll post more as soon as we have more information.

Report: "Elevated error rates for SIP/PSTN"

Last update
resolved

This incident has been resolved.

monitoring

It appears that SignalWire rolled back the change, and the errors have stopped. We are monitoring for any further issues.

identified

SignalWire has confirmed that they made an unannounced change that's causing the problem, and they're working to revert it. We're also working on a quick production update to work around the issue if they can't revert quickly enough.

investigating

We're investigating elevated error rates from SignalWire when provisioning SIP/PSTN resources. If you're creating rooms with dial-in or dial-out enabled and it isn't absolutely necessary, you can remove those params from your room creation request to successfully create rooms.
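As a minimal sketch of that workaround (assuming Node 18+ with built-in fetch and an API key in DAILY_API_KEY; property names you omit should match whatever your app normally sets, e.g. dialout_enabled as mentioned in later updates):

```typescript
// Sketch: create a room without SIP/PSTN provisioning while provider error rates are elevated.
async function createRoomWithoutDialin(name: string) {
  const res = await fetch("https://api.daily.co/v1/rooms", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DAILY_API_KEY}`, // assumed env var
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name,
      // Omit dial-in/dial-out properties so room creation doesn't depend on the SIP/PSTN provider.
      properties: {},
    }),
  });
  if (!res.ok) throw new Error(`Room creation failed: ${res.status}`);
  return res.json();
}
```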

Report: "Elevated error rates for SIP/PSTN"

Last update
Investigating

We're investigating elevated error rates from SignalWire when provisioning SIP/PSTN resources. If you're creating rooms with dial-in or dial-out enabled and it isn't absolutely necessary, you can remove those params from your room creation request to successfully create rooms.

Report: "Issues with SIP/PSTN audio quality"

Last update
resolved

This incident has been resolved.

monitoring

We've deployed an update to production servers to remediate the issue, and we're testing to ensure everything is fixed.

identified

We've identified an issue with how Daily and SignalWire are negotiating audio codecs for incoming SIP/PSTN calls. We're working on an update that will change the default audio codec used between Daily and SignalWire to Opus. Once this is deployed, you may still experience audio issues if you've explicitly set your SIP audio codec to PCMU as described here: https://docs.daily.co/guides/products/dial-in-dial-out/sip#sip-dial-in-audio-and-video

identified

We've identified an issue causing degraded ("broken up" or "choppy") audio between some Daily sessions and SignalWire SIP/PSTN endpoints. PSTN dialout and PIN dialin do not seem to be affected. SIP dialout seems to be moderately affected. PIN-less PSTN dialin and SIP dialin seem to be experiencing a much higher proportion of affected calls.

investigating

We're investigating reports of poor quality audio for SIP/PSTN participants in some calls.

Report: "Issues with SIP/PSTN audio quality"

Last update
Investigating

We're investigating reports of poor quality audio for SIP/PSTN participants in some calls.

Report: "Elevated SIP/PSTN error rates"

Last update
resolved

This incident has been resolved.

monitoring

We're still waiting on some additional fixes from SignalWire. Room creation with dial-in/dial-out is working, but you still may experience problems updating dial-in/dial-out settings for existing rooms. Daily Bots and Pipecat Cloud users should be unaffected.

monitoring

SignalWire has restored service, but we're still seeing some error responses from their API. We believe the majority of room creation failure issues have been resolved. We'll post here again when we've handled this last issue.

identified

We've been told we should be back online after a hotfix at approximately 18:00 UTC, or 15 minutes from now. We'll post another update as soon as we have more info.

identified

We're still waiting on SignalWire to restore service.

identified

We're still monitoring. In addition to dialout_enabled, you'll need to remove other room properties related to SIP/PSTN, dialin, and/or dialout to create rooms.

identified

We're continuing to monitor an issue from our SIP/PSTN provider. The rest of the Daily platform is unaffected. If you're getting an error when trying to create a room, you can remove the dialout_enabled property from the room creation request and try again.

identified

Our SIP/PSTN provider has identified an issue and they're deploying a fix. In the meantime, you should still be able to create Daily rooms without provisioning SIP/PSTN.

investigating

We're seeing elevated error rates from our SIP/PSTN provider. If you're creating rooms with dial-in support and getting errors, you may want to retry creating those rooms without dial-in, and then add dial-in with an update REST request.
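A hedged sketch of that two-step pattern, assuming Node 18+ fetch and an API key in DAILY_API_KEY; the exact dial-in properties are passed in by the caller rather than guessed here:

```typescript
// Sketch: create the room without dial-in first, then retry adding dial-in via an
// update request (POST /rooms/:name) once the provider recovers.
const API = "https://api.daily.co/v1";
const headers = {
  Authorization: `Bearer ${process.env.DAILY_API_KEY}`, // assumed env var
  "Content-Type": "application/json",
};

async function createThenAddDialin(name: string, dialinProps: Record<string, unknown>) {
  // Step 1: create the room with no dial-in so creation isn't blocked by provider errors.
  const created = await fetch(`${API}/rooms`, {
    method: "POST",
    headers,
    body: JSON.stringify({ name, properties: {} }),
  });
  if (!created.ok) throw new Error(`create failed: ${created.status}`);

  // Step 2: retry the dial-in update with a simple backoff until it's accepted.
  for (let attempt = 1; attempt <= 5; attempt++) {
    const updated = await fetch(`${API}/rooms/${name}`, {
      method: "POST",
      headers,
      body: JSON.stringify({ properties: dialinProps }),
    });
    if (updated.ok) return updated.json();
    await new Promise((r) => setTimeout(r, attempt * 2000));
  }
  throw new Error("could not add dial-in after retries");
}
```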

Report: "Elevated SIP/PSTN error rates"

Last update
Investigating

We're seeing elevated error rates from our SIP/PSTN provider. If you're creating rooms with dial-in support and getting errors, you may want to retry creating those rooms without dial-in, and then add dial-in with an update REST request.

Report: "Delayed audio for some SIP/PSTN dial-in calls"

Last update
resolved

This incident has been resolved.

monitoring

We've deployed a fix, and we're monitoring for any further issues.

identified

We've identified the issue, and we're testing a fix in our staging infrastructure. We'll post another update when we've deployed the fix.

identified

We're getting reports of delayed audio from some customers using SIP dialin and dialout. When the phone user joins the call, they can talk and others will hear them, but the phone user won't hear any audio from other call participants (bot or human) for the first 20-30 seconds. We're continuing to troubleshoot the issue, and we'll post here as soon as we have more info.

investigating

We're investigating an issue that's causing delays in audio connection for some SIP/PSTN calls.

Report: "Issues connecting to rooms"

Last update
resolved

This issue has been resolved.

monitoring

We've resolved the issue and we're monitoring to ensure the platform is operating normally.

investigating

We're investigating an issue that may be preventing some users from joining meeting rooms.

Report: "Networking issues"

Last update
resolved

We've deployed an update that increases the throughput of the database that was the bottleneck in today's incident. We'll have more info about additional remediations and a postmortem for today's incident within the next few days.

monitoring

Our metrics have stayed at normal levels since our remediating actions about 30 minutes ago. We're continuing to monitor the platform while we discuss longer-term solutions to make absolutely sure we've addressed the root cause here.

monitoring

We've made some changes to the affected database, and our metrics and error rates have returned to normal. We've also re-enabled delivery of all webhooks, and we're monitoring for any further issues.

identified

We're addressing an issue with an internal database that's causing problems with existing meetings, as well as starting new ones. Your users are likely seeing some failures when trying to join meeting sessions, and users in ongoing sessions are seeing occasional meeting moves. We'll post more information as soon as it's available.

identified

We've temporarily disabled the component that sends webhooks.

investigating

We're continuing to investigate the source of meeting disruptions. Customers may be experiencing 'meeting moves' where a call session moves from one server to another, causing a 2-3 second disruption to the call. You may also see delays in receiving meeting.started and meeting.ended webhooks.

investigating

We're investigating issues that may be causing problems with network connections between regions.

Report: "Issues starting recordings"

Last update
resolved

We're still making a few small infrastructure changes, but our internal metrics have been back at normal levels for some time.

identified

We've identified an issue causing some recordings to fail to start, specifically in the Oracle Cloud San Jose region. We've already made some infrastructure changes that should be routing new recording requests to other regions. If you've seen a recording fail to start, you can try starting it again using daily-js or the REST API. We'll keep you posted on our progress resolving the issue.
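For client apps, retrying is a one-liner in daily-js, assuming you already hold the call object for the affected session (an equivalent REST request also works, as the update notes):

```typescript
import { DailyCall } from "@daily-co/daily-js";

// Assumes `call` is the existing daily-js call object for the session whose
// recording failed to start during the incident.
function retryRecording(call: DailyCall) {
  call.startRecording(); // retry the recording that previously failed to start
}
```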

Report: "Delayed API calls"

Last update
postmortem

On Tuesday, October 22, around 17:15 UTC (9:15 AM PDT), a Daily customer started running a series of load tests. Their test involved rapidly creating and deleting a large number of rooms that used PSTN dial-out, cloud recording, and webhooks. This eventually caused several capacity threshold alerts to fire around 18:15 UTC (10:15 AM PDT) as our system scaled out to handle the load. We noticed that their test was running a script that created a room and started dial-out, but almost every instance of the script was exiting the room uncleanly before the outgoing call even connected to anything. This exposed an edge case that caused a ‘zombie’ PSTN participant to stay in that session and continue to try and send presence updates indefinitely. We’re already working on fixing that bug. This has probably happened before, but in much smaller quantities, since it involves a very unusual combination of events. Because this was an automated load test, however, it caused too many of these ‘zombies’ to build up, all trying to write frequent presence updates to the database.

Soon, the database response time began to slow under the increased load. Around that same time (18:15 UTC, 10:15 AM PDT), we noticed an increase in API error rates, specifically for actions that required writing to the database. Our team started to work both problems at once: safely get rid of the ‘zombie’ sessions without affecting other customers, and alleviate the load on the database to improve API response times. API error rates for POST requests spiked as high as 8%, and error rates for all requests peaked at 2-3%.

We were able to return API error levels and latency back to normal by around 19:50 UTC (12:50 PDT) by refreshing several database instances. We contacted the customer and stopped the load tests, and then we were able to remove the ‘zombie’ sessions through our normal deploy process.

We’re sorry for the disruption this caused. We’re already working on several remediations, including fixing the bug that caused the ‘zombie’ sessions, as well as adjusting platform rate limits to prevent this from happening again.

resolved

This issue has been resolved. We will post more information about this incident in the near future.

monitoring

API latency and errors have stayed at normal levels for a while now, but we're continuing to monitor for any further impact.

identified

API error levels have decreased considerably, but we're still working on full remediation. More updates to come.

identified

We've identified an issue causing some slowdowns in one of our databases, leading to some delayed or failed API responses. We've solved the root cause of the issue, but we're being cautious about restoring the database to full functionality, so we expect the delays to continue for a short time.

investigating

We're investigating an issue that's causing delays with some API operations, such as creating rooms and starting recordings. We'll post more info as soon as we have it.

Report: "Missing meeting webhook deliveries"

Last update
resolved

This incident has been resolved. Customers needing assistance with missing webhook deliveries should contact support via help@daily.co.

monitoring

Between 14:01 UTC and 17:12 UTC webhooks for meeting.started and meeting.ended events were not delivered. We have applied a mitigation and are continuing to monitor. The underlying cause for the missing deliveries is still under investigation.

Report: "dashboard.daily.co availability"

Last update
resolved

This incident has been resolved.

monitoring

The upstream issue has been resolved, and we're monitoring for any more issues.

investigating

Some customers are seeing 400 BAD_REQUEST messages when trying to load dashboard.daily.co. This is likely related to a Vercel incident: https://www.vercel-status.com/incidents/f6b2blrl5f5f

Report: "Elevated latency on some API endpoints"

Last update
resolved

API latency has returned to normal levels.

monitoring

We have applied a mitigation and are continuing to monitor the situation.

investigating

We are currently investigating increased latency affecting some of our APIs.

Report: "Degraded logging and metrics API performance"

Last update
resolved

The impaired database system has fully recovered and is operating normally. API performance has returned to normal levels.

monitoring

The degraded logging and metrics API performance was the result of an impaired database system. The initial impact was resolved earlier today, but we continue to monitor the system as recovery completes.

investigating

We are currently investigating an issue with degraded performance with the logging and metrics API.

Report: "Elevated latency / intermittent failures on API endpoints"

Last update
resolved

Network-level issues are resolved and service is operating nominally.

monitoring

Network-level mitigations have been applied and we are seeing latency back to normal levels.

investigating

We're currently investigating this issue.

Report: "Issue with sessions in ap-northeast-2"

Last update
resolved

This issue has been resolved, and call sessions in ap-northeast-2 are working normally.

investigating

We've confirmed an issue preventing some users from joining calls hosted in the ap-northeast-2 region. Sessions in other regions are unaffected. If you've set the 'geo' property on your domain or a specific room to 'ap-northeast-2', you may want to temporarily change it to 'ap-south-1' or another nearby region.
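A minimal sketch of that room-level change, assuming Node 18+ fetch and an API key in DAILY_API_KEY (remember to set geo back once the incident is resolved):

```typescript
// Sketch: temporarily pin a room's new sessions away from ap-northeast-2.
async function moveRoomRegion(roomName: string, geo = "ap-south-1") {
  const res = await fetch(`https://api.daily.co/v1/rooms/${roomName}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DAILY_API_KEY}`, // assumed env var
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ properties: { geo } }), // e.g. "ap-south-1" per the update above
  });
  if (!res.ok) throw new Error(`geo update failed: ${res.status}`);
  return res.json();
}
```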

investigating

We're investigating an issue preventing some users from joining call sessions in the ap-northeast-2 region.

Report: "Problems connecting to rooms"

Last update
resolved

This incident has been resolved.

monitoring

We've identified an issue that was causing some users to receive an error when trying to join a call. Affected users would see an error in the console starting with "web socket connection failed". We've rolled back a platform update from earlier today, and the errors have stopped. We're still diagnosing the problem with the platform update, but operations are back to normal.

investigating

We're investigating reports of problems when trying to join calls.

Report: "Issues connecting to calls"

Last update
postmortem

On Tuesday, February 7 at 9:47 AM Eastern time (14:47 UTC), our database reported a performance issue under normal operational load. We had upgraded the database server over the weekend, but it had been operating normally since Monday. The alerts indicated a high level of lock contention on the newly upgraded database, which was causing problems for our call servers (SFUs). The SFUs are designed to shut themselves down if they are not able to connect to our database. When an SFU shuts down, our autoscaling will start a new SFU to replace it. With several SFUs shutting down at the same time (and several new ones starting), we experienced a larger than normal volume of “meeting moves”, which added additional load to a database that was already struggling.

A “meeting move” occurs when an old SFU is shutting down. Our webapp automatically moves any ongoing call sessions on that SFU to a different SFU. During a meeting move, users will usually notice everyone else’s video drop out for a second or two before reappearing.

The next few paragraphs show the sequence of events between 09:51 and 10:47 that helped us identify the cause.

By 9:51 (T+4 minutes), engineers had found a potential culprit: a large volume of queries stuck in a deadlock. These were “meeting events” from the SFU, noting when participants joined or left meetings. This was causing the webapp API requests to time out and return 5xx errors, and ultimately causing the SFUs to drop their connections and restart.

By 10:13 (T+26 minutes), we had found one potential cause of the deadlocks. After our database migration from the previous weekend, we were still using MySQL binary log replication to keep our old database up to date. We disabled binlog replication and restarted the database to try and reduce the overall load on the database. This helped, but many of the SFUs retried the queries that were causing the deadlocks, so the problem persisted. We continued investigating, and also contacted AWS support to see if they had any insight on the issue.

At 10:47 AM (T+1 hour), engineers were working on a script that would terminate stuck queries when the database suddenly restarted itself. This restart took slightly longer than the one at 10:13, and it allowed the SFUs to discard the now-stale meeting updates without being disconnected long enough to cause them to restart. At this time, the SFUs and the platform went back to normal operation.

We were ultimately able to prove that the deadlocking behavior was caused by a low-level behavior change introduced in a point release of MySQL. Our database maintenance from the previous weekend had upgraded us to that version and introduced the change. Working around that behavior change involved updating an index on one affected table. We spent the rest of the week developing and testing a plan to update the production database, and we completed that work with no user impact on Saturday evening.

At 11:01, we decided we could move into a monitoring state while continuing to investigate the root cause. We left the status incident in a “monitoring” state until Friday, because we wanted to make sure we fully understood the initial cause of the deadlocks and took any necessary action to avoid it in the future. One such action was the addition of rate limiting to the room creation API endpoint.

The overall impact of this incident was limited to almost exactly one hour, between 14:47 and 15:47 UTC. During that time, some users in Daily calls experienced the “meeting moves” described earlier. There may have been a small number of users who weren’t able to join a room if they happened to try in the middle of a “meeting move”, which lasts a few tens of seconds. They would have joined on reattempting a few seconds later. Similarly, some REST API requests may have returned 5xx error codes as well.

We are continuing to work with AWS to make sure that the deadlocks issue we saw in production with Aurora MySQL 2.11.0 is fully documented, understood, and fixed in a future release. A more conservative approach to deadlocks was a known change in MySQL 5.7 (which Aurora MySQL 2 is based on). However, the severity of the deadlocks that we experienced during this incident was a surprise to us and to the AWS Aurora team. We try hard to test all infrastructure changes under production-like workloads. In this case, we failed to test with a synthetic workload that had the right “shape” to trigger these deadlocks. As a result of this incident, we have added additional API request patterns to our testing workload. We’ve also added some new production monitoring alarms that are targeted at more fine-grained database metrics.

resolved

We've identified the issue that caused the incident on Tuesday morning. While we've already deployed fixes that helped prevent the problem from reoccurring, we still need to perform one more database update that will require a short scheduled maintenance. That will likely happen this weekend. We will post a full retro after completing the final database maintenance operation.

monitoring

We've deployed a platform update with a few improvements designed to mitigate the impact of the current database performance issue. The only thing you may notice is that you'll no longer see 429 rate limit responses in your Dashboard API logs. Our database metrics have remained normal today, but we'll continue to monitor the platform to verify these fixes and watch for further issues.

monitoring

While we were able to restore platform functionality earlier today, we've continued to troubleshoot the underlying issue that caused the problem. As a precautionary measure, we've temporarily enabled rate limiting on the REST API endpoint used to create rooms. The limit for <tt>POST /rooms</tt> is now the same as the <a href="https://docs.daily.co/reference/rest-api#rate-limits">DELETE /rooms/:name endpoint</a>. You can expect about 2 requests per second, or 50 over a 30-second window.
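If your app creates rooms in bursts, a small client-side guard for the temporary limit might look like this sketch (retry counts and delays are arbitrary choices; only the roughly 2 requests/second figure comes from the update above):

```typescript
// Sketch: back off and retry when POST /rooms returns 429 under the temporary rate limit.
async function createRoomWithBackoff(body: unknown, maxRetries = 5): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch("https://api.daily.co/v1/rooms", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.DAILY_API_KEY}`, // assumed env var
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });
    if (res.status !== 429) return res; // success, or an error unrelated to rate limiting
    // Honor Retry-After if the response includes it; otherwise wait a bit longer each time.
    const waitSeconds = Number(res.headers.get("retry-after")) || attempt + 1;
    await new Promise((r) => setTimeout(r, waitSeconds * 1000));
  }
  throw new Error("still rate limited after retries");
}
```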

monitoring

We’ve addressed the issue with the database, and platform operations have returned to normal. We are monitoring alerts and metrics for any further issues.

identified

We've identified an issue with one of our databases that coordinates activity between call servers. This is causing elevated rates of "meeting moves", which is when an ongoing call session has to move from one call server to a different one. If you're in a call when this happens, you'll notice everyone's video and audio drop out and come back within a few seconds. You may also need to restart recording or live streaming when this happens. You may also experience timeouts when making REST API requests. We'll post more information as soon as it's available.

investigating

We are investigating elevated platform error rates. Users may get websocket connection errors when trying to join calls.

Report: "Issues connecting to calls."

Last update
resolved

This incident has been resolved.

monitoring

A fix has been applied and we are monitoring to be sure that all underlying issues are resolved.

identified

We have identified the issue and are applying a fix.

identified

The issue has been identified and a fix is being implemented.

investigating

We're investigating an issue preventing some users from connecting to calls.

Report: "Missing metrics in call participant logs"

Last update
resolved

We've confirmed the initial report that there are a small number of recent call sessions that didn't log any metrics data. This can happen if your app has multiple call object instances running on the same page. Your app may do this if you are calling <tt>createCallObject()</tt> more often than you think; for example, in a React effect hook. Multiple call objects usually cause a variety of other errors on the page, so if you aren't already troubleshooting app issues related to this problem, you don't need to be worried about missing metrics. We are adding functionality to daily-js to help customers identify if they have multiple call objects on the same page. If you need help resolving this issue in your app, please feel free to contact support.
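To illustrate the pattern described above, here's a hedged sketch of a React hook that avoids creating duplicate call objects when an effect re-runs (for example under React 18 StrictMode); the hook name and structure are illustrative, not from the incident:

```typescript
import { useEffect, useRef } from "react";
import DailyIframe, { DailyCall } from "@daily-co/daily-js";

// Reuse a single call object per page and clean it up on unmount, instead of
// calling createCallObject() on every effect run.
function useCallObject(roomUrl: string) {
  const callRef = useRef<DailyCall | null>(null);

  useEffect(() => {
    if (!callRef.current) {
      callRef.current = DailyIframe.createCallObject();
    }
    const call = callRef.current;
    call.join({ url: roomUrl });

    return () => {
      // Destroying the instance prevents a remount from leaving two call
      // objects on the same page, the situation that can drop metrics.
      call.destroy();
      callRef.current = null;
    };
  }, [roomUrl]);

  return callRef.current;
}
```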

investigating

We're investigating reports of missing metrics data in participant logs from a small number of users. This may date back to some time around 2023-01-23 17:00 UTC (9:00 AM PST on Monday, Jan 23).

investigating

We're investigating reports of missing metrics data in participant logs.

Report: "Problems creating raw-tracks recordings"

Last update
resolved

This incident has been resolved.

monitoring

We've deployed new call servers and recording infrastructure to resolve the issue. You should be able to start a raw-tracks recording from any call session that started on or after approximately 02:15 UTC. Existing long-running call sessions may still be running on older call server instances. Those sessions may still experience errors with raw-tracks recordings. Those sessions will automatically move to new call server instances within the next few hours as part of our normal deploy process. We'll resolve this incident when all of the old call server instances have been retired and operations are back to normal.

identified

We're in the process of deploying updates to resolve this issue. We'll resolve the incident as soon as the fix is live in production.

identified

We've confirmed an issue preventing the creation of raw-tracks recordings. Other recording types are unaffected, including "cloud" recordings to your own S3 bucket. If you need to record an important call during this incident, you can change the <tt>enable_recording</tt> property on your domain, room, or meeting token to <tt>cloud</tt> to make a cloud recording.
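A minimal sketch of the room-level version of that change, assuming Node 18+ fetch and an API key in DAILY_API_KEY; the enable_recording property and the "cloud" value come from the update above:

```typescript
// Sketch: switch a room to "cloud" recordings for the duration of the incident.
async function switchRoomToCloudRecording(roomName: string) {
  const res = await fetch(`https://api.daily.co/v1/rooms/${roomName}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DAILY_API_KEY}`, // assumed env var
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ properties: { enable_recording: "cloud" } }),
  });
  if (!res.ok) throw new Error(`update failed: ${res.status}`);
  return res.json();
}
```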

investigating

We're investigating reports of errors from customers trying to create "raw-tracks" recordings.

Report: "Intermittent issues with cloud recordings"

Last update
resolved

This incident has been resolved.

monitoring

We are experiencing an issue where cloud recordings are intermittently returning all black frames. We have pushed a fix to production and are currently monitoring the situation.

Report: "Intermittent issues starting cloud recordings and livestreams"

Last update
resolved

This incident has been resolved.

monitoring

Daily’s autoscaling system failed to communicate with some internal services, preventing it from adding capacity for cloud recording and live streaming quickly enough to keep up with demand. We resolved the issue, and we're monitoring platform operations to ensure that everything has returned to normal.

monitoring

We have pushed a fix to production and are currently monitoring the situation.

identified

We have identified the issue and are currently testing a fix.

investigating

We are experiencing an issue where customers attempting to start livestreams or cloud recordings are intermittently receiving a temporarily-unavailable error.

Report: "Problems joining rooms in us-east-1"

Last update
resolved

This incident has been resolved.

monitoring

We identified a brief issue with DNS while deploying a call server (sigh, it's always DNS). This would have caused intermittent join problems for some users for several minutes. We resolved the issue, and we're monitoring platform operations to ensure that everything has returned to normal.

investigating

We're investigating an issue preventing some users from joining meetings hosted in the <tt>us-east-1</tt> region.

Report: "Issues connecting to rooms"

Last update
resolved

AWS has resolved their issue, and our operations have returned to normal.

monitoring

We're still watching the ongoing AWS status incident until it's resolved. We'll provide another update if anything changes in the meantime.

monitoring

AWS has acknowledged an issue with API Gateway in the <tt>us-west-2</tt> region. We're routing API requests to other regions for now, so everything should be operating normally for you and your users. We'll leave this issue open until AWS has resolved their underlying issue and our health checks return to normal.

identified

We're routing around a possible networking issue to our API gateways in <tt>us-west-2</tt>. This should allow your users to connect to calls, but we're still watching for other networking problems or follow-on effects (https://news.ycombinator.com/item?id=33010341).

investigating

We're investigating an issue preventing some users from connecting to rooms in the <tt>us-west-2</tt> region.

Report: "Problems connecting to rooms"

Last update
resolved

We've re-enabled our API Gateways in us-west-2, and users are connecting to rooms successfully. This incident is resolved.

monitoring

We've confirmed that the issues with joining calls were a result of an AWS incident posted on their status site. AWS has resolved that incident, and we're seeing successful responses from our us-west-2 resources in our staging environment. We should be re-enabling our us-west-2 API Gateways shortly.

investigating

We've temporarily removed our affected us-west-2 API Gateways while AWS works to resolve the underlying issues. That should solve the problem that was preventing users from joining calls. Existing call sessions should be unaffected. We're closely monitoring other parts of our infrastructure, and we'll provide updates here if further issues emerge.

investigating

We are investigating an AWS issue with API Gateways in the us-west-2 region that is preventing some users from joining Daily sessions.

Report: "Users may be unable to view meeting session data in dashboard"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been deployed, and dashboard users should now be able to access meeting session data. We're continuing to monitor the situation.

identified

We've identified an error impacting the ability for some users to view meeting session data in the Daily Dashboard. A fix is being implemented.

Report: "Customers may experience difficulty downloading meeting recordings."

Last update
resolved

The issue impacting downloads of meeting recordings is resolved.

monitoring

Daily has resolved the issue impacting downloads of meeting recordings, and continues to monitor the situation.

identified

Daily has identified a problem that is impacting the ability to download meeting recordings via the dashboard and access-link APIs, and is implementing a fix.

Report: "Connectivity issues"

Last update
resolved

Error rates and network metrics have returned to normal levels, so we're considering this issue resolved.

identified

We've seen overall error rates decrease as AWS has been working to resolve the networking issue. Things are improving, but you may still experience delays and errors until this issue is fully resolved.

identified

We're continuing to see issues across our platform as a result of the ongoing AWS outage. You'll likely experience problems joining calls, accessing the Dashboard, or using the REST API. We'll continue to post more information here as we have it.

identified

We're experiencing network delays and timeouts throughout our infrastructure as a result of a larger-scale AWS incident. You may experience problems connecting to calls, viewing your Dashboard, or making REST API requests. We'll update this incident as we know more.

investigating

We're investigating reports of problems connecting to calls.

Report: "Degraded audio and video call experience"

Last update
resolved

We have restored our service provider configuration to its nominal state after confirming that all providers are operating normally.

monitoring

We have temporarily routed traffic through another service provider, which should resolve call connection issues for most users.

identified

We are continuing to work on a fix for this issue.

identified

We are currently experiencing an issue with one of our service providers that may be affecting connection to calls (slow connections or timeouts). We are implementing a fix.

Report: "Audio and video calls impacted by an ongoing incident in us-east-1"

Last update
resolved

AWS has resolved the regional issues in us-east-1. We have re-activated the us-east-1 region for new audio and video calls.

monitoring

We have attempted to work around this AWS issue for users that would normally be routed to our us-east-1 resources by temporarily removing all of our us-east-1 DNS records from our AWS API Gateway configurations whilst we wait for AWS to recover the region. We'll continue to monitor the situation.

investigating

Users close to the us-east-1 (Northern Virginia) region of AWS may be unable to join video calls. We are investigating.

Report: "Internet connectivity - South America region"

Last update
resolved

This incident has been resolved.

monitoring

Connectivity issues in the South America - Brazil region have been resolved. We are continuing to monitor the situation.

identified

AWS is experiencing intermittent Internet connectivity issues in the South America - Brazil region. Users may experience a degraded experience connecting to audio and video calls in the region. Connecting to calls may take longer, and users in the region may connect to a server in another region until connectivity returns to normal.

Report: "Dashboard degraded: logging and telemetry impacted"

Last update
resolved

This incident has been resolved.

monitoring

The impacted data repository has returned to normal operation. Audio and video call logs and metrics should now be available in the dashboard.

identified

We have identified a problem retrieving audio and video call logging and telemetry from the dashboard. Customers may experience a 'Session not found' error message when attempting to view call logs and telemetry.

Report: "Call Telemetry degraded"

Last update
resolved

We have confirmed that call telemetry data is flowing nominally into the data repository.

monitoring

A resolution has been implemented and telemetry data from ongoing calls should appear in the dashboard.

identified

We have identified a problem with our call telemetry data repository and are working on a fix. Telemetry data from some calls may not appear in the dashboard. Audio and video call experiences are not impacted.

Report: "Degraded Call Experience"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

The fix for the degraded call experience has been deployed. Daily is monitoring in-call activity.

identified

Some users are experiencing an intermittently degraded call experience or difficulty connecting to calls in some geographic regions. A fix is currently being deployed.

Report: "Degraded dashboard and REST API performance"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We're experiencing higher than normal CPU utilization on our database. You may experience slower than normal Dashboard loading and REST API requests.

Report: "SSL Certificate Expiration"

Last update
resolved

On August 31, 2020, at 05:28:23 (all times UTC), the Secure Sockets Layer (SSL) certificate used to secure connections to Daily's Selective Forwarding Units (SFUs) expired. The expired certificate meant that client connections to these servers no longer worked, and had the following impact:
- Meetings using our WebSocket signaling option did not function.
- Regardless of the signaling option, meetings that use SFU rather than peer-to-peer media delivery did not function.