Pusher

Is Pusher Down Right Now? Check whether there is an ongoing outage.

Pusher is currently Operational

Last checked from Pusher's official status page

Historical record of incidents for Pusher

Report: "Interruption in Stats Collection and Degraded Datadog Integration"

Last update
resolved

The backlog has fully cleared and Stats Collection and Datadog integration are now operating normally. This incident is now resolved. We apologise for any inconvenience caused and appreciate your patience throughout.

monitoring

Systems are now catching up on the backlog, and Realtime Stats and Datadog integration should gradually return to normal. We are continuing to monitor the recovery closely and will provide another update once the catch-up is fully complete.

identified

The issue affecting Stats Collection and the Datadog integration has been identified and addressed. Systems are now catching up on the backlog, and Realtime Stats and Datadog integration should gradually return to normal. We are continuing to monitor the recovery closely and will provide another update once the catch-up is fully complete. Thank you for your continued patience.

investigating

We are currently experiencing an issue affecting our stats collection systems. As a result, Realtime Stats are currently unavailable. Additionally, our Datadog integration is experiencing degraded performance. Our engineering team is actively investigating the root cause and working to restore normal service as quickly as possible. We will provide updates as we learn more. Thank you for your patience and understanding.

Report: "Pusher dashboard inaccessible for logins with username/password"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue where the Pusher dashboard is inaccessible to users attempting to log in with a username and password. Users logging in with SSO are unaffected. Pusher products remain fully functional.

Report: "Users unable to Log In using Google Login"

Last update
resolved

Our engineering team has resolved this incident. We apologise for any inconvenience this may have caused.

investigating

Our engineering team is currently investigating an issue that is causing errors when users attempt to log in to the Pusher portal via Google. All other login methods are functioning as expected.

Report: "Some customers are unable to downgrade their account"

Last update
resolved

This incident has been resolved.

investigating

We are aware of an issue where customers can't downgrade their account plan. Everything else continues to function normally.

Report: "Beams: intermittent disruption loading accounts from the pusher dashboard"

Last update
resolved

From 05:02 UTC until 06:23 UTC, Beams experienced intermittent disruption loading accounts from the Pusher dashboard. This was caused by internal services failing to pick up a new SSL certificate that was automatically issued during this time.

Report: "Increased latency on MT1"

Last update
resolved

This incident has been resolved. The team saw similarities with a previous incident that affected the US2 cluster and acted swiftly to mitigate and prevent further impact. Overall the impact was very small: a few connections saw small delays receiving messages for a very short time.

investigating

We are currently investigating this issue.

Report: "We are experiencing a major outage in our US2 cluster"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented, and we are monitoring the results. WebSocket connections should now be functional.

monitoring

The team has been able to identify the issue and push a fix. We are seeing a full recovery of the services. We will continue to monitor.

investigating

Our team is still investigating the issue within our infrastructure that is preventing us from adding WebSocket capacity to the US2 cluster. This issue specifically impacts our WebSocket deployments, and the majority of clients are currently unable to connect using WebSocket. However, fallback mechanisms (SockJS, HTTP streaming, and HTTP polling) are available if they are not disabled in your configuration.
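The fallback mechanisms mentioned above are controlled by client configuration. A minimal sketch of a pusher-js client that keeps the HTTP-based fallbacks available (the key is a placeholder, and option names should be verified against the pusher-js version you use):

```javascript
// Sketch, not an official snippet. 'APP_KEY' is a placeholder.
const Pusher = require('pusher-js');

const pusher = new Pusher('APP_KEY', {
  cluster: 'us2',
  // Leaving disabledTransports empty (the default) keeps SockJS and
  // HTTP streaming/polling available if WebSocket connections fail.
  disabledTransports: [],
  // Alternatively, an explicit allow-list can be given:
  // enabledTransports: ['ws', 'xhr_streaming', 'xhr_polling', 'sockjs'],
});
```

If `enabledTransports` is set to `['ws']` only, clients have no fallback during a WebSocket-specific incident like this one.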

investigating

We are escalating the current issue from a partial outage to a major outage. The majority of traffic in the US2 cluster is currently affected. Our team is fully engaged and working to resolve the issue.

investigating

We are investigating what is causing the issue to our US cluster.

Report: "Inaccurate Account Usage Warnings Reported"

Last update
resolved

We have resolved the statistics computation issue. Connection counts should be correct again and remain accurate going forward. Daily message volume counters will still show inflated usage until the end of the day; customers that are concerned about potential throttling due to breaching daily limits can reach out to customer support to ensure access remains uninterrupted.

investigating

Some customers have received incorrect warnings about account usage or observed usage statistics that appear to be incorrect. We are investigating.

Report: "Statistics Aggregation delayed"

Last update
resolved

This incident has been resolved.

monitoring

Aggregated statistics have been restored. We are monitoring the situation.

identified

The issue has been identified and mitigated. Missing aggregated statistics are being recovered; no data has been lost.

investigating

We are currently investigating an issue with aggregated statistics being delayed on the Dashboard.

Report: "Stats collection experiencing an interruption"

Last update
resolved

This incident has been resolved.

monitoring

The problem has been identified and fixed. We are monitoring the results.

investigating

We are investigating why Stats collection has stopped.

Report: "Intermittent failures delivering webhooks on EU cluster"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Experiencing Degraded Performance in EU clusters for Pusher Channels"

Last update
resolved

This incident has been resolved.

monitoring

The issue has been resolved and we are monitoring the system.

identified

Issue identified and being fixed.

investigating

Impact: We are currently experiencing degraded performance on our EU clusters. As a result, some users are unable to send messages, and delays may occur.
Affected product: Pusher Channels
Current status: Our team is actively investigating the issue and working to restore full functionality as quickly as possible. We appreciate your patience and understanding as we work to resolve this issue promptly.

Report: "Experiencing Delays in Webhook Processing for Pusher Channel"

Last update
resolved

The disruption has been resolved.

monitoring

Recovered. Actively monitoring the system.

investigating

Impact: We are currently experiencing delays of up to 15 minutes in processing webhooks across our EU, MT1, and US2 clusters. Despite these delays, you can still use our system and successfully process messages.
Affected product: Pusher Channels
Action taken: Our team is actively supervising and adjusting the system to address this issue. We are progressively scaling up the pod count to improve processing efficiency, and we are conducting a thorough analysis of traffic patterns to gain a better understanding of the load. We appreciate your patience and understanding as we work to resolve this issue promptly.
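During incidents like this one, webhooks can arrive well after the events they describe. Pusher's webhook payload carries a top-level `time_ms` timestamp that consumers can use to detect stale deliveries; a minimal sketch (the threshold and function name are illustrative, not part of Pusher's API):

```javascript
// Sketch: flag webhook deliveries that arrive long after the events they
// describe, using the documented top-level `time_ms` field of the payload.
// MAX_DELAY_MS and classifyWebhook are illustrative names.
const MAX_DELAY_MS = 15 * 60 * 1000; // tolerate up to 15 minutes of delay

function classifyWebhook(body, nowMs = Date.now()) {
  const delayMs = nowMs - body.time_ms;
  return {
    delayMs,                       // how late the delivery is
    stale: delayMs > MAX_DELAY_MS, // true if beyond the tolerance window
    events: body.events,           // the batched events to process
  };
}
```

A consumer might log or sideline stale batches rather than acting on presence or occupancy events that no longer reflect current state.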

Report: "SSL certificate renewal for our legacy telemetry endpoint"

Last update
resolved

The SSL certificate for our client stats endpoint expired at 00:00 UTC, and the issue was resolved at 08:49 UTC. During this period, clients who opted to send client stats using older versions of the JavaScript SDK for web (PusherJS) encountered errors and were unable to send telemetry data to our servers. This does not impact any functionality of our SDKs, even in older versions. However, we are unable to verify this on much older versions of our SDKs, as our team no longer monitors them. We have replaced the certificate on this endpoint with an auto-renewable certificate to prevent this issue from happening again. The client stats endpoint is a legacy endpoint and is not used in newer versions of our SDKs.

Report: "High Latency in US2 on Channels WebSocket Client API"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue. Impact is greatly reduced now. Customers are seeing lower latencies.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

Customers are seeing high latency on the US2 cluster (WebSockets). The team is working on it.

Report: "High Latency in US2 on Channels WebSocket Client API"

Last update
resolved

This incident has been resolved.

monitoring

A fix for this issue has been deployed and we are monitoring the system.

identified

We are continuing to investigate issues with users unable to establish connections from the WebSocket client.

identified

Issues with US2 have returned. We are seeing issues maintaining connections.

monitoring

A fix has been implemented and the latency issues have resolved in US2. We will monitor the systems to ensure the issue does not reoccur.

investigating

We are currently investigating an issue with high latency in US2 on the Channels WebSocket Client API

Report: "Service Disruption Notice: Pusher US2 Cluster"

Last update
resolved

On Saturday, March 30th, from 04:15 UTC until 11:30 UTC, Pusher Channels experienced a partial outage and higher than usual latencies on our US2 cluster.

The incident was triggered by a single high-volume customer sending requests that were causing an authentication error. These errors are collated and stored in our error logging system, which can be viewed on the Pusher Dashboard. Due to the exceptionally high number of errors, the logging system was overwhelmed, and all writing to the log system backlogged on our socket processes. This in turn caused the socket processes to take longer to process messages and to start failing health checks. The failing health checks caused pods to restart, which resulted in connections being dropped.

We identified this issue and took steps to mitigate it, attempting to switch off non-critical components to alleviate stress on the system. When it became clear that these steps were having little impact, we deployed a hotfix to disable the error logging system. During the deployment of this hotfix the system hit a limit on the number of registered targets on our cloud provider's load balancer. This caused the rolling deployment to take much longer than expected, as it waited for old socket processes to be fully drained and terminated. After coordinating with our cloud provider we were able to increase this limit, allowing the deployment to complete, and the incident was resolved.

There were two main windows of impact: from 04:15 to 05:40 UTC and from 07:15 to 09:15 UTC. From 09:15 until 11:30 the service was operating normally and we saw a ramp-up in connections.

Report: "Functions dashboard returns error"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating an issue where an error is shown on the Dashboard when navigating to the Functions feature.

Report: "Pusher.com website down"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Pusher US2 webhooks delayed"

Last update
resolved

We have identified the issue and resolved it. Webhooks are behaving as expected again.

investigating

We are investigating a delay in Webhooks.

Report: "Pusher EU webhooks delayed"

Last update
resolved

All delayed EU webhooks completed processing at approximately 10:30 UTC. This incident is resolved.

identified

We have deployed a fix for this issue. The delivery of the delayed webhooks continues.

identified

We are continuing to see delays in webhooks. We are working on implementing an additional fix.

monitoring

A fix has been implemented and we are monitoring the results as the backlog is processed.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Pusher US2 webhooks delayed"

Last update
resolved

All delayed US webhooks completed processing at approximately 07:30 UTC. This incident is resolved.

identified

We have deployed a fix for this issue. The delivery of the delayed webhooks continues.

identified

The issue has been identified and a fix is being implemented.

Report: "Pusher US3 cluster outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The team has identified the problem. Most connections are stable now, but some may still reconnect. The team is working on a permanent solution.

investigating

Pusher US3 cluster is experiencing disruptions for client socket connections. Our team is looking into it.

Report: "Channels Stats Missing Data"

Last update
resolved

Between 15:37 and 16:01 UTC we encountered an issue with our stats pipeline that resulted in missing metrics. Customers may notice data missing for this period in the Pusher Dashboard metrics, along with any configured integration dashboards (Datadog and Librato). The issue has been resolved and stats have been published correctly since 16:01.

Report: "Public EU cluster outage"

Last update
resolved

This incident has been resolved.

monitoring

The public EU cluster has been restored and all services are now back online. We will continue to monitor the cluster health.

identified

The issue has been identified and a fix is being implemented.

investigating

The public EU cluster is currently experiencing an outage which affects customers in this region. For customers in other regions, there is no impact. We are actively working on restoring the public EU cluster.

Report: "US2 Customers may be experiencing latency issues with or unable to utilize the Channels WebSocket and REST API"

Last update
resolved

This incident has been resolved.

monitoring

The fix has been implemented and latency has improved.

identified

The team has identified the cause of the latency and are actively working to resolve the high latency issues.

investigating

We are currently investigating an issue in US2 where some clients may be experiencing latency issues or unable to utilize the Channels WebSocket and REST API.

Report: "Customers using Datadog exporter feature at pusher channels might experience a small data loss or delay on the dashboards"

Last update
resolved

This incident has been resolved.

investigating

Our engineering team has detected an issue that is preventing data from being published to Datadog. We are currently investigating the root cause.

Report: "High latency on channels US2 API"

Last update
resolved

Our engineering team has confirmed this issue has been mitigated. We apologise for any inconvenience this may have caused you.

identified

Our engineering team is working on rolling out an improvement to resolve the high latency on the US2 cluster.

investigating

We are currently investigating this issue.

Report: "A subset of webhooks may not be successfully delivered - EU cluster only"

Last update
resolved

Our engineering team has rolled out a fix and has confirmed webhooks delivery is fully operational on the affected EU cluster.

investigating

Our engineering team is actively investigating an issue that is impacting the delivery of a specific set of webhooks. This only affects our EU cluster.

Report: "intermittent issues with publishing messages in EU cluster"

Last update
resolved

This issue has been resolved.

monitoring

The issue has been identified and a fix has been released.

investigating

We are currently investigating the issue.

Report: "Webhook delivery issue - Channels EU cluster"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're currently investigating a webhook delivery issue in the EU cluster, impacting a small portion of traffic.

Report: "Channels EU cluster having intermittent issues"

Last update
resolved

The cluster is now operational following the implementation of a fix at 17:13 UTC. Our team is actively monitoring the results. The primary contributing factor was identified as a DNS-related issue. In order to determine the root cause, a thorough investigation will be conducted and a detailed post-mortem will be published.

investigating

We are continuing our investigation into the ongoing issue affecting EU cluster. Our team is actively working to mitigate the problem. While the cluster is slowly recovering, it has not yet reached a stable state.

investigating

We are currently experiencing an outage affecting API, WebSocket, and Webhook delivery services. Our team is actively investigating the issue

monitoring

All systems are now operational. However, clients utilising fallback transports may still experience intermittent issues. We are actively working to address the remaining issues.

investigating

Webhook delivery was temporarily affected, resulting in delays for some customers. Delivery is returning to normal.

investigating

Over the last two hours we saw three short bursts of errors related to service restarts, with limited customer impact. We are continuing to investigate the issue.

investigating

We are currently investigating this issue.

Report: "Disruption in Channels stats integration"

Last update
resolved

Between 06:10 and 09:14 UTC, we encountered an issue that temporarily disrupted the export of data to our stats integrations as well as our own dashboard. As a result, customers using our stats integration to relay metrics to platforms like Datadog and Librato may observe a gap in their data, and the same gap may be visible in our own dashboard.

Report: "Marketing website is unreachable for some visitors"

Last update
resolved

The marketing website (pusher.com) should now be reachable for all visitors, and this incident is now resolved. Pusher services were available and unaffected the entire time.

investigating

Our marketing website (pusher.com) is unreachable for some visitors. All Pusher services are unaffected and working as expected. We are investigating the issue with the marketing website.

Report: "Partial outage in eu cluster"

Last update
resolved

Between 14:30 and 14:45 UTC, the Channels API experienced a partial outage. At its peak, approximately 32% of requests to publish a new message failed and most clients had to reconnect.

Report: "Insights and graphs not available in either customer or admin view"

Last update
resolved

Most customers have had their graphs restored. Over the next few days, the team will persist in their efforts to ensure that all customers' graph issues are resolved, albeit outside the incident process.

monitoring

We've observed an improvement in processing speed, and for numerous customers, the data is now readily available and accurate on the Dashboard. We are actively working on the issue.

monitoring

Some data processing is taking longer than expected. We are actively working on the issue.

monitoring

The data replay is ongoing. We expect it to be complete within the next 6 hours.

monitoring

We are now replaying old data in order to process it. This may cause intermittent delays in usage stats across the dashboard.

monitoring

The team has restored the service and is currently monitoring.

investigating

We are currently investigating an issue that is preventing telemetry from being pushed to the customer and admin views, making the graphs unavailable.

Report: "Elevated API Errors"

Last update
resolved

This incident has been resolved.

identified

Our vendor continues to investigate and work towards resolution of this issue. We are continuing to see elevated error rates on our public cluster during this time.

identified

AWS has confirmed an issue in the us-west region. We are actively assessing the impact on our customers, who may experience elevated error rates during this time.

investigating

Customers hosted on our us3 cluster are currently experiencing elevated error rates. Our team is currently looking into the issue.

investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Report: "MT1 Cluster issues"

Last update
resolved

This issue is now resolved and the service is operational once more. An RCA will be published once we conclude internal investigations into the issue.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are investigating reports of issues with our MT1 cluster.

Report: "MT1 cluster outage"

Last update
resolved

This issue is now resolved, with connections being established successfully across both websocket and fallback protocols. We will share a post mortem when the full investigation is complete.

monitoring

We've noticed some improvements, although the cluster isn't completely stable yet. Our team is working to resolve the issue. We're currently able to accept more connections through the WebSocket transport; however, webhook delivery and fallback transports continue to experience higher impact. Our customer support team has been assisting some customers in redirecting their traffic to alternative US-based clusters (US2 and US3).
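Redirecting traffic to another cluster, as mentioned above, is done by changing the client's cluster option. A minimal sketch, with the caveat that the app must also be provisioned on the target cluster and the key shown here is a placeholder:

```javascript
// Sketch: pointing a pusher-js client at an alternative US cluster
// (e.g. us3 instead of mt1). 'APP_KEY_US3' is a placeholder for the key
// of an app provisioned on that cluster.
const Pusher = require('pusher-js');

const pusher = new Pusher('APP_KEY_US3', {
  cluster: 'us3', // was 'mt1' before the redirect
});
```

The same cluster setting exists on the server-side libraries, so publishers and subscribers must be switched together.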

monitoring

We have implemented a fix and are monitoring progress as WebSocket connection reliability improves. We anticipate that our fallback protocols, SockJS and XHR, will improve over the course of the next 30 minutes.

investigating

We are continuing to investigate this issue.

investigating

We are investigating an issue with clients unable to establish a connection to the mt1 cluster.

Report: "MT1 Elevated Errors"

Last update
resolved

This incident has been resolved.

monitoring

We have implemented a fix and are monitoring the impact - so far the system looks operational.

investigating

We are seeing that error rates have increased once more and are continuing to investigate.

monitoring

The cluster has recovered. We are monitoring the results.

investigating

We're experiencing an elevated level of API errors on MT1 and are currently looking into the issue.

Report: "Elevated error rates on mt1 cluster"

Last update
resolved

Between 12:33 and 13:05 UTC customers may have encountered an increase in error rates when interacting with the Channels HTTP API. The error rate has since subsided. We are investigating the platform to identify the underlying cause.

Report: "Partial disruption in DataDog metrics exporter"

Last update
resolved

An incident occurred yesterday during a maintenance operation, resulting in disruptions to our stats exporter. Some customers might have gaps in their DataDog metrics between 18:30 and 19:30 UTC.

Report: "ap3 cluster experiences reconnections for transports: sockjs, xhr_streaming, and xhr_polling"

Last update
resolved

The issue has been identified and resolved.

investigating

On the ap3 cluster, clients using one of the sockjs, xhr_streaming, or xhr_polling transports will experience regular reconnections. We are investigating the issue.

Report: "Issues impacting clients connected to our fallback transports in the MT1 cluster"

Last update
resolved

Between 15:50 and 18:00 UTC, we experienced a DNS-related issue that affected a substantial portion of our SockJS traffic. Clients using WebSocket were unaffected by this incident.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Partial outage in MT1 cluster"

Last update
resolved

Between 19:22 and 19:51, approximately 1.2% of messages failed, and a considerable number of our clients experienced re-connections. Our monitoring system experienced partial downtime, leading to confusion in our autoscaling policy. This confusion triggered a scale-down event in the cluster, resulting in reduced available capacity. Our engineering team has taken steps to prevent this from happening again. This incident has been resolved.

Report: "Partial outage in MT1 cluster"

Last update
resolved

Between 14:55 and 15:07 UTC, the Presence and Channel Existence state features in MT1 were affected, particularly for new connections. From 15:15 to 16:23 UTC, a significant number of clients connected through our fallback transports (xhr_streaming and long-polling) continued to experience the same issue. This incident has been resolved.

Report: "MT1 API service degradation"

Last update
resolved

From 13:31 to 13:53 we saw an increase in error rates and elevated latencies when interacting with the Pusher Channels API on the mt1 cluster. The issue has since been resolved, and an investigation into the root cause is underway.

Report: "Customers are unable to access the Pusher Dashboard."

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We've received reports that customers are unable to access the dashboard. Pusher engineers are currently investigating.

Report: "Mt1 cluster outage - API and Connections unavailable"

Last update
resolved

This incident has been resolved. It started at 10:32 UTC and the fix was deployed at 10:35 UTC.

monitoring

The issue has been resolved.

Report: "EU cluster outage - API and Connections unavailable"

Last update
resolved

This incident has been resolved.

monitoring

The issue is resolved and we'll keep monitoring.

identified

We investigated the issue and deployed a solution.