Historical record of incidents for WorkOS
Report: "Elevated errors in AuthKit"
Last updateWe are investigating an increase in errors with AuthKit and Admin Portal.
Report: "Audit logs failing to load on Admin Portal"
Last updateThis incident has been resolved.
A fix is deployed. Continuing to monitor.
We are currently investigating an issue with our Admin Portal that is affecting only the Audit Logs product. We will share more information as soon as it's available.
Report: "Increased error rates authenticating with AuthKit"
Last updateThis incident has been resolved.
We have deployed a fix and are monitoring.
We are investigating intermittent errors authenticating with AuthKit. This also impacts authentication to WorkOS Dashboard.
We are investigating intermittent errors authenticating to the WorkOS Dashboard.
Report: "Elevated AuthKit redirect errors"
Last updateFrom 2025-04-19 02:11 to 2025-04-23 20:18 UTC some users experienced issues with redirects upon successful authentication. This issue has been resolved.
We have addressed the issue and are monitoring.
We have identified the issue and are working on a fix.
We are investigating errors with non-default redirects in AuthKit authentication flows.
Report: "Elevated errors authenticating with AuthKit"
Last updateOn 2025-04-17, from 21:30 UTC to 22:13 UTC, some users may have experienced issues authenticating with SSO or OAuth providers. This issue has been resolved.
We've fixed the underlying issue and are monitoring.
We are investigating elevated errors during authentication with AuthKit. Users may be unable to log in.
Report: "Webhook Delivery Delays"
Last updateAll backlogged webhooks are delivered. This incident has been resolved.
We’ve identified the issue and are seeing improvements in webhook delivery. Some delays may still occur as we continue to monitor and work toward full resolution.
We’ve identified the issue and are seeing improvements in webhook delivery latency. We’re continuing to monitor delivery.
We’re currently investigating an issue causing delays in webhook delivery.
Report: "Elevated AuthKit and Dashboard errors"
Last updateThis incident has been resolved.
On 2025-03-18, from 08:11 to 09:07 UTC, users may have experienced errors when trying to authenticate with AuthKit. Users authenticating to the WorkOS Dashboard may also have experienced errors. This issue has been resolved.
Report: "Increased errors"
Last updateWe have continued to monitor throughout the day and have not seen any further issues.
All services have recovered. We are continuing to monitor closely to ensure full resolution of the issue.
We have identified an issue that is causing failures with AuthKit, SSO, Audit Logs, and the WorkOS Dashboard. We are working to mitigate the issue as quickly as possible. We will provide another update within 15 minutes.
We're investigating reports of increased errors and timeouts.
Report: "Intermittent timeouts"
Last updateWe have mitigated the issue that was causing sporadic timeouts.
We are investigating intermittent timeouts with some API operations, and are working on mitigations within the affected systems.
Report: "Elevated Error Rates"
Last updateThis incident has been resolved.
We are seeing elevated error rates across our services. We will share an update as soon as possible.
Report: "Increased API Latency"
Last updateThis incident has been resolved.
A fix has been applied and latencies are coming back down to normal levels.
Our network provider has identified the issue and is working on a fix.
We are working with our network provider to investigate the cause of the increased API latency.
We're investigating the issue.
Report: "Elevated Error Rates"
Last updateThis incident has been resolved.
Our team has identified the issue and implemented a fix. Services are returning to normal.
We are seeing elevated error rates and request timeouts across our services. We will share an update as soon as possible.
Report: "Issue with Magic Auth emails with new user invitations"
Last updateThis incident has been resolved.
The issue has been identified and we are currently implementing a fix.
We are currently investigating the issue. Magic Auth for sign-in and sign-up flows is not impacted.
Report: "Custom domain creation errors"
Last updateThis incident has been resolved.
The issue has been addressed at our upstream provider and we are monitoring.
We've identified the issue and are working with our upstream provider on the fix.
We are investigating errors when creating custom domains.
Report: "Audit Logs Ingestion Issues"
Last updateAudit log functionality has been restored.
Ingestion of audit log events has returned to normal. Our team is continuing to monitor processing.
We are continuing to monitor Audit Log ingestion.
Our team has identified the issue and has rolled out a fix. Audit Log ingestion is returning to normal.
We're currently investigating an issue with ingestion in our Audit Logs API.
Report: "Issues with AuthKit"
Last updateAuthKit functionality has been restored.
A fix has been rolled out and systems are returning to normal. We are continuing to monitor functionality.
We've identified the issue and have rolled out a fix.
Our engineers are investigating an issue with logging in through AuthKit.
Report: "Delay in Webhook delivery"
Last updateAll systems have returned to normal.
From 2024-11-13 12:20 to 13:50 UTC, webhook event delivery was delayed for a few customers. This issue has been resolved and all delayed webhooks have now been sent.
Report: "SSL validation errors with authkit.app domains"
Last updateThis incident is now resolved.
We have deployed a fix and are monitoring to ensure the SSL issue is fully resolved.
We are currently investigating reports of SSL validation errors affecting authkit.app domains.
Report: "Increased latency"
Last updateThis incident is now resolved. Latency remains at normal levels and most background jobs are being processed normally. Customers may still notice delays with some delete operations as we process the last of the backlog.
The fix has been deployed and the system is recovering. Latency has returned to normal levels, but certain record updates may be delayed as we work through a backlog of queued changes.
Some customers may be experiencing an increase in system latency. We have identified the cause of the issue and are deploying a fix.
Report: ""Invalid Signature" errors for some SAML connections"
Last updateThis issue has been resolved.
We have deployed a fix and we are monitoring the system to confirm that the issue is fully resolved.
We are aware of reports of "Invalid Signature" errors for some SAML connections. We have identified the cause of the issue and are working on a fix.
Report: "Issue with new SFTP Directory Sync connections"
Last updateThis incident has been resolved.
We have identified an issue that is preventing Directory Sync users from creating new SFTP connections. We are currently working on fixing the issue.
Report: "Delay in Sending Emails"
Last updateThe issue has been resolved.
From 2024-09-20 12:30 to 13:30 UTC, emails were not being sent on time and were delayed by up to 40 minutes. This issue has been resolved and all delayed emails have now been sent.
Report: "AuthKit, Admin Portal, WorkOS Dashboard Unavailable"
Last updateAll systems have returned to normal.
AuthKit, Admin Portal, and the WorkOS Dashboard are unavailable. We are actively investigating the issue.
Report: "Issue with WorkOS Dashboard, Admin Portal, AuthKit, Docs"
Last updateThis incident has been resolved.
We are seeing service availability restored. We are continuing to monitor status from our upstream provider.
We have identified the issue with our upstream provider Vercel. Vercel has implemented a fix.
The WorkOS Dashboard, Admin Portal, AuthKit, and Docs are unavailable. We are tracking an issue with an upstream hosting provider.
Report: "Degraded AuthKit Service"
Last update

## Summary

On 2024-07-11 at 20:00 (UTC), AuthKit generated invalid SSO and OAuth authorization URLs for customers using a custom authentication API domain. As a result, end-users encountered a 404 (not found) page when attempting to sign in through AuthKit.

## Root Cause Analysis

As part of the sign-in flows, AuthKit leverages the WorkOS API ([https://api.workos.com](https://api.workos.com)). Customers may sign up to use a custom API domain, which serves as an alias for the WorkOS API. On 2024-07-10, we discovered a case where the custom API domain, for those who configured it, wasn’t being used properly in sign-in flows for OIDC connections. On 2024-07-11 we pushed an update to address this behavior and introduced a more severe issue that generated a malformed authorization URL, resulting in a Not Found page for the user.

## Actions and Remediations

* **Remediation:** Upon learning of the issue, we promptly rolled back the faulty deploy which resolved the problem. Although the rollback was quick once initiated, identifying the issue took significantly longer than acceptable, resulting in ~32 minutes of unavailability for customers with custom API domains.
* **Permanent fix:** The correct fix for the original issue related to custom API domains was deployed a few hours later.
* **Improving Test Coverage:** We have identified several areas for improvement in our test suite. We are adding new integration and end-to-end tests, including scenarios for customers with custom API domains configured which will prevent similar issues.
* **Enhancing Monitoring:** We are upgrading our monitoring tools to better detect request anomalies and perform more sophisticated automated checks that simulate end-user behavior in critical sign-in flows. This should reduce the time to detect issues.

## Timeline

* 2024-07-11 20:01 (UTC): faulty code deployed
* 2024-07-11 20:28 (UTC): issue was acknowledged by WorkOS team
* 2024-07-11 20:33 (UTC): rollback deployed
* 2024-07-11 20:47 (UTC): last NotFound error seen (~2% of the total errors occurred between 20:33 and 20:47, which is attributed to the time needed to invalidate the cached page instances at the edge.)
* 2024-07-12 00:31 (UTC): permanent fix was deployed
On 2024-07-11, from 20:01 to 20:46 UTC, users may have experienced an error attempting to authenticate using Hosted AuthKit. Only customers using custom API domains were impacted.
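As a quick check of the figures in the postmortem above, the intervals can be recomputed from the timeline timestamps. This is only an illustrative sketch: the dictionary keys and the `minutes_between` helper are hypothetical names, not part of any WorkOS tooling.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Timestamps copied from the postmortem timeline (all UTC).
timeline = {
    "faulty_deploy": datetime.strptime("2024-07-11 20:01", FMT),
    "acknowledged": datetime.strptime("2024-07-11 20:28", FMT),
    "rollback": datetime.strptime("2024-07-11 20:33", FMT),
    "last_error": datetime.strptime("2024-07-11 20:47", FMT),
}


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two named timeline events."""
    delta = timeline[end] - timeline[start]
    return int(delta.total_seconds() // 60)


# Time from the faulty deploy until the team acknowledged the issue.
time_to_detect = minutes_between("faulty_deploy", "acknowledged")
# Time from the faulty deploy until the rollback (the "~32 minutes" figure).
time_to_rollback = minutes_between("faulty_deploy", "rollback")
# Tail of errors after the rollback while cached pages were invalidated.
cache_tail = minutes_between("rollback", "last_error")

print(time_to_detect, time_to_rollback, cache_tail)
```

Most of the ~32 minutes before the rollback was detection time, which is consistent with the postmortem's emphasis on improved monitoring.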
Report: "Degraded AuthKit service"
Last updateAuthKit service has been restored to full operation.
On 2024-07-02, from 13:25 to 13:45 UTC, users may have experienced an error attempting to authenticate using Hosted AuthKit.
Report: "Elevated Dashboard Errors"
Last updateThis incident is now resolved and the Dashboard is operational again.
We've identified the issue and have deployed a fix. We're continuing to monitor the status of the Dashboard.
We are investigating an issue with errors on our Dashboard. We will share an update as soon as possible.
Report: "Elevated Errors with Audit Logs and SSO"
Last updateThe incident has been resolved.
Services have been restored and we are continuing to monitor.
We are continuing to investigate this issue.
We are investigating an increase in errors related to Audit Logs and SSO.
Report: "Some SSO connections incorrectly in validating state"
Last updateOn 2024-06-07, from 20:51 to 21:54 UTC, some teams may have received spurious "connection activated" emails. The connections were already active and sign-in was not impacted. During the same period, for teams using Hosted AuthKit with SSO connections, end-users may have seen a sign-in screen that did not include any option to authenticate. This was caused by a deployed code change that incorrectly set the state of some SSO connections to 'validating'. All impacted SSO connections have been restored to the correct state of 'active'.
Report: "SSO & Admin Portal Errors for Custom Domains"
Last updateThe impact from the issue has been resolved.
On 2024-05-01, from 12:50 to 19:58 UTC, some users would have received errors when attempting to sign in via SSO and/or access the WorkOS Admin Portal if custom domains had been configured recently. This was due to a deployment that included an incorrect configuration for custom domains. The correct configuration has been restored and the issue is resolved.
Report: "Issues with Domain Aliasing"
Last updateThis incident has been resolved.
On 2024-03-28, from 20:19:00 to 21:05:00 UTC, User Management Sessions and generated links for Admin Portal and Magic Link emails were incorrectly formed. This issue has been resolved.
We've applied a fix and are monitoring.
We have identified an issue impacting Session tokens and are working on a fix.
Report: "Elevated Error Rates"
Last updateThis incident has been resolved.
We are continuing to monitor for any further issues.
From 10:24 UTC to 11:32 UTC there were intermittent elevated error rates across all products. The incident has since been resolved, and we are continuing to monitor.
Report: "Dashboard unavailable"
Last updateThe fix has been deployed and the Dashboard should be working as expected.
Dashboard is unavailable. We have identified the issue and are rolling out a fix. We will share an update as soon as possible.
Report: "Elevated Errors and Latency"
Last updateThis incident has now been resolved.
From 15:55 UTC to 16:05 UTC, elevated error rates occurred across all products. Developers may have also experienced increased latency and delayed email, SMS, and webhook delivery. This incident has now been resolved, and we are continuing to monitor.
Report: "Audit Logs write failures"
Last updateThis incident has been resolved.
From 16:21:29 UTC to 16:22:29 UTC, audit log create_event calls failed due to an issue with database failover.
Report: "Magic Auth Email Sending Impacted"
Last updateThe issue has been resolved.
From 2024-02-08 22:38 to 2024-02-09 02:11 UTC, magic auth emails were not being sent. This issue has been resolved. All other types of emails were being sent as expected during this period.
We are investigating an issue with the sending of Magic Auth emails. We will share more information as soon as it's available.
Report: "Elevated Error Rates"
Last updateThis incident has been resolved.
From 21:40 UTC to 22:10 UTC, there were intermittent elevated error rates across the WorkOS Dashboard, AuthKit, Admin Portal, and Docs. The incident has now been resolved, and we are continuing to monitor. The WorkOS API was unaffected.
Report: "Elevated Platform Errors"
Last updateThis incident has been resolved.
A fix has been implemented by our upstream provider and we are now monitoring the results.
We have identified the issue and are working with our upstream provider.
Report: "Elevated errors for SSO API"
Last updateFrom 15:26 to 15:32 UTC, users may have experienced an increase in errors with our SSO API. This issue is now resolved.
Report: "Increase in API and Dashboard errors"
Last updateFrom 16:06 to 16:18 UTC, users may have experienced an increase in errors from the API and Dashboard. This issue is now resolved.
Report: "MFA SMS Delivery Failures"
Last updateThis issue has been resolved.
We are observing successful SMS delivery to all US phone numbers, including Verizon and AT&T networks. We are monitoring to ensure continued successful delivery. We apologize for the inconvenience and will share an update once we have more information.
We are investigating an issue with SMS delivery for MFA to US Verizon and AT&T phone numbers. We apologize for the inconvenience and will share an update once we have more information.
Report: "Experience degraded performance for SSO"
Last update

From 2023-08-28 23:22 UTC to 2023-08-29 02:22 UTC WorkOS’s SSO product was unavailable to users through custom domains. Requests returned HTTP 403 Forbidden errors.

We understand that WorkOS sits on a critical path for our customers’ applications. This is not a responsibility we take lightly and this outage is not in line with the service we aim to provide. We are taking all necessary steps to ensure an incident like this does not happen again.

### Who was affected?

The incident affected SSO API requests for users with custom hostnames. Affected requests during this time resulted in 403 errors and displayed the error message “This web property is not accessible via this address.”

### What happened?

While performing maintenance on our Web Application Firewall, a new set of rules was applied to production. This change marked some legitimate requests as anomalous. Alerts were not properly configured to notify engineers when seeing a spike in anomalous 4XX traffic.

The main factor that led to this incident was improper controls around how production Web Application Firewall changes should be applied.

### What will we do to mitigate problems like this in the future?

Moving forward, WorkOS will take the following actions:

1. Establish additional access control policies around applying Web Application Firewall changes.
2. Add monitoring around increases in anomalous traffic.
3. Add monitoring around failures with custom hostnames.
This issue has been resolved.
Services have returned to normal and we are continuing to monitor the situation.
We've identified the issue and are working on a resolution.
We are experiencing a partial outage of SSO. We are currently investigating and will update when we identify the issue.
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
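The postmortem for this incident lists monitoring for increases in anomalous 4XX traffic as a follow-up action. One simple way such a check could work is a sliding-window ratio with a threshold. The sketch below is an illustration only and does not describe WorkOS's actual monitoring; the class name, window size, and threshold are all assumptions.

```python
from collections import deque


class ClientErrorRateMonitor:
    """Tracks the share of 4XX responses over the most recent N requests."""

    def __init__(self, window_size: int = 100, threshold: float = 0.2) -> None:
        # Each entry is True when the corresponding response was a 4XX.
        self._window = deque(maxlen=window_size)
        self._threshold = threshold

    def record(self, status_code: int) -> bool:
        """Record one response; return True if the 4XX share exceeds the threshold."""
        self._window.append(400 <= status_code < 500)
        if len(self._window) < self._window.maxlen:
            return False  # not enough data yet to judge
        return sum(self._window) / len(self._window) > self._threshold


# Hypothetical traffic: ten successful requests followed by four 403 responses.
monitor = ClientErrorRateMonitor(window_size=10, threshold=0.3)
alerts = [monitor.record(code) for code in [200] * 10 + [403] * 4]
print(alerts[-1])
```

With a 10-request window and a 30% threshold, the alert fires on the fourth consecutive 403, when 4 of the last 10 responses are client errors.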
Report: "Issues with Google Workspace Directory Syncing"
Last updateWe have identified the root cause as an issue with the Google Workspace API. Google has identified and rolled back the change that caused this issue. Directory Sync for Google Workspace is now working as intended. Beginning at 19:29 UTC on August 9, 2023, some calls to retrieve group and group membership information from the Google APIs began returning incomplete data, which made it appear that some groups and associated memberships had been deleted. These deletions are critical directory events that WorkOS would sync to our customers’ systems. We chose to pause directory syncing for Google Workspace rather than send these erroneous events. We have validated that Directory Sync events are now accurate and resumed syncing. Google Workspace directory webhooks are being delivered with the correct updates.
We are encountering issues with Google Workspace Directory Syncing such that we are unable to guarantee the accuracy of group and group membership events. We are actively investigating and working towards identifying the root cause, and have escalated to Google and are working directly with them to move towards a resolution. Until this is resolved, we have paused syncing for Google Workspace directories and are not sending webhooks.
Report: "General API Degradation"
Last updateBetween 15:10 and 15:30 UTC we saw elevated API errors. This issue has been resolved.
Starting at 15:10 UTC we saw elevated API errors. We have identified the issue and applied a fix. We are currently monitoring for further issues.
Report: "Email delivery delays"
Last updateWe have resolved the issues with email deliveries being delayed.
We have implemented a fix for email deliverability issues and are currently monitoring.
We have identified the issues with email deliveries being delayed and are implementing a fix.
We're currently investigating an issue with email deliveries being delayed.
Report: "Platform Outage"
Last update

On 2023-04-11, from 18:14:48 to 18:33:20 UTC, several WorkOS products were unavailable and requests to the API encountered server errors.

We understand that WorkOS sits on a critical path for our customers’ applications. This is not a responsibility we take lightly and this outage is not in line with the level of service we aim to provide. We are taking all necessary steps to ensure an incident like this does not happen again.

### Who was affected?

The incident affected incoming API requests with impact spanning many of our products. Affected requests during this time resulted in 500 and 503 responses.

### What happened?

While performing maintenance, a networking configuration change was inadvertently applied. This change prevented our services from connecting to our storage services such as databases and caches.

The main factors that led to this incident were improper controls around testing and applying production networking changes.

### What will we do to mitigate problems like this in the future?

Moving forward, WorkOS will take the following actions:

1. Establish additional access control policies around applying configuration changes.
2. Add checks for sensitive actions to production resources.
3. Ensure infrastructure changes are managed consistently by infrastructure-as-code workflows.
The issue is now fixed and we are no longer seeing elevated errors in our systems.
We deployed a fix and are seeing our systems going back to normal.
We identified the issue and are working on a fix. We will provide more updates soon.
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
We are investigating an issue with our API. We apologize for the inconvenience and will share an update once we have more information.
Report: "Sending duplicate update webhooks for users and groups"
Last updateThis incident has been resolved.
We've deployed a fix and are monitoring the situation.
We've noticed a bug in our system that could send a large amount of duplicate `dsync.user.updated` and `dsync.group.updated` webhooks. These updates may put additional strain on your systems processing these webhooks, but should not impact your directory state. We apologize for the inconvenience and are currently deploying a fix.
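Since duplicate `dsync.user.updated` and `dsync.group.updated` deliveries like those described above can occur, a consumer may want its update handling to be idempotent. The following is a minimal sketch, not WorkOS's recommended implementation. It assumes each payload is a dict with an `event` name and a `data` object carrying an `id`; that shape, the class name, and the in-memory store are assumptions made for illustration.

```python
import hashlib
import json
from typing import Callable


class IdempotentUpdateHandler:
    """Applies an update only when its data differs from the last one applied."""

    def __init__(self, apply_update: Callable[[dict], None]) -> None:
        self._apply_update = apply_update
        # Maps (event type, object ID) to a digest of the last applied data.
        # A real consumer would keep this in a durable store, not in memory.
        self._last_digest = {}

    def receive(self, payload: dict) -> bool:
        """Return True if the update was applied, False if it was a duplicate."""
        key = (payload["event"], payload["data"]["id"])
        canonical = json.dumps(payload["data"], sort_keys=True)
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        if self._last_digest.get(key) == digest:
            return False  # same content as the last applied update; skip it
        self._apply_update(payload["data"])
        self._last_digest[key] = digest
        return True


# Hypothetical payloads: the same update delivered twice, then a real change.
applied = []
handler = IdempotentUpdateHandler(applied.append)
first = {"event": "dsync.user.updated", "data": {"id": "user_1", "email": "a@example.com"}}
changed = {"event": "dsync.user.updated", "data": {"id": "user_1", "email": "b@example.com"}}
results = [handler.receive(first), handler.receive(first), handler.receive(changed)]
print(results, len(applied))
```

Comparing content rather than delivery identifiers means the handler skips repeated updates even if each duplicate arrives as a distinct delivery, at the cost of storing one digest per synced object.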
Report: "Elevated Platform Errors"
Last updateThis incident has been resolved.
The service causing errors has recovered and we are continuing to monitor for further issues.
We've identified an issue with a service that is causing errors to propagate to the WorkOS Dashboard and Directory Sync list users endpoint.
We're investigating elevated platform errors.
Report: "Incorrect redirect when IdP-Initiated SSO is blocked"
Last updateThis incident only affected customers who have IdP-Initiated SSO disabled. From Mar 14th at 3:17pm PT until Mar 15th at 12:23pm PT, users who were blocked from IdP-Initiated SSO were redirected to a WorkOS error page instead of the default redirect_uri provided. This issue was introduced during a refactor designed to improve our error experience. IdP-Initiated SSO is disabled for a small subset of our customers, and this issue was not caught as part of our release process. We have since updated the new service to account for this behavior. To prevent similar issues in the future, we’ve made improvements to our testing processes and are also improving our tracking of features that are enabled for a limited set of customers, such as "Disabling IdP-initiated SSO".
Report: "Increased error rate and latency in API and Dashboard"
Last updateBetween 18:46 and 19:14 UTC we saw increased API error rates and latency as well as degraded performance in the dashboard. This issue is now resolved.
Between 18:46 and 19:14 UTC we saw increased API error rates and latency as well as degraded performance in the dashboard. We have deployed a fix and are monitoring.
Between 18:46 and 19:14 UTC we saw increased API error rates and latency as well as degraded performance in the dashboard. We have identified the issue and are deploying a fix.
We are continuing to investigate this issue.
We are currently seeing increased API error rates and latency as well as degraded performance in the dashboard. We are currently investigating this issue.
Report: "Issues with Google Workspace directory sync"
Last updateSyncs for Google Workspace directories have returned to normal. This incident is now resolved.
All Google Workspace directories have been re-synced. We are continuing to monitor the situation.
We have deployed a fix and are in the process of re-syncing the impacted directories.
We have identified the source of the issue and are working to remediate it.
We are investigating some issues with Google Workspace directories not syncing properly. We apologize for the inconvenience and will share an update once we have more information.