Historical record of incidents for Ontraport
Report: "HTTPS Outage"
Last update: All impacted services appear fully functional and are operating normally. The underlying cause was a cache server stall that briefly prevented HTTPS certificates from loading properly. We will fully investigate the cause and implement changes to this part of the infrastructure to help prevent this issue from occurring in the future.
Services have been restored. We are going to be observing and monitoring this to ensure it's fully functional.
We are currently fixing an outage that impacts HTTPS certificates for various types of web pages.
Report: "Investigating outage"
Last update: This incident has been resolved.
Pushed out a fix.
We think we know where the service is having issues. Looking further into a way to fix it.
We are currently investigating this issue.
Report: "App Down"
Last update: The datacenter was doing some maintenance on their switches and something went wrong. They have resolved the incident, and we're seeing the network back to normal operations. We will follow up with them to get more details. On the Ontraport side, no issue was localized to us; it was only network connectivity. Any scheduled mail would have been queued and, at this point, sent out.
The network seems to be stable again. Our servers and systems were fine, but the network from the datacenter was flapping. Monitoring to see if it stays stable.
We are continuing to work on a fix for this issue.
It looks like the datacenter is having some network flapping issues. We are reaching out to them to see if they can resolve it.
We are currently investigating some network connectivity issues.
Report: "HTTP/S Outage"
Last update: We are confident that this issue is resolved and is related to physical maintenance (hardware installations) at our datacenter. Automated (hardware level) pathing failover did not occur as expected, which resulted in a need for manual intervention. We will examine this in more detail to identify any areas for improvement.
Services have been restored. We are continuing to investigate and monitor this to identify what exactly happened.
We are currently working to resolve a global HTTP/S outage. The issue has been identified and we are working to restore services.
Report: "HTTPS Certificate Issue"
Last update: All residual domains have been processed. We will continue to monitor this moving forward, but this specific incident is resolved. As mentioned before, if you are having issues regarding this problem, please contact support.
The backlog processor has cycled through 100% of the domains. We will continue monitoring this issue and implement fixes and safety measures where necessary to ensure this does not happen again. If you are still experiencing certificate expiration issues, please contact Ontraport Support.
The backlog processor is 98% complete.
The backlog processor is 90% complete.
The backlog processor is 81% complete.
The backlog processor is 72% complete.
The backlog continues to process and is at 62% complete.
The renewal processor is continuing to run through the backlog. The backlog is currently 30% complete at the time of this update. While an exact ETA is not known, we are estimating another 12-16 hours for the backlog to be fully processed.
An isolated group of HTTPS certificates has failed to renew, resulting in browsers displaying "insecure site" messages. The primary affected certificates are clustered around the "ontralink.com" tracking link domain; there are also some non-tracking domains impacted. The "ontralink.com" certificates have been repaired/renewed (roughly 99%), and the repair/renewal for other impacted domains is already underway (roughly 25% completed at the time of writing). While the renewal queue is processing, if you need an immediate fix, contact Ontraport Support and they can fast-track domains upon request. This does not affect any new certificate issuances and should not affect certificates issued within the past 80 days. We are continuing to investigate the root cause and will update this issue as more progress is made. This notice is backdated to the earliest observed instance.
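For readers who want to check how long a domain's certificate has left while the renewal queue processes, here is a minimal sketch using Python's standard `ssl` module. The helper names `days_remaining` and `fetch_not_after` are illustrative assumptions and not part of Ontraport's tooling.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Days until a certificate's notAfter date (negative if already past).

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(hostname, port=443, timeout=5.0):
    """Return the notAfter field of the certificate served by `hostname`.

    An expired certificate fails verification, so in that case this raises
    ssl.SSLCertVerificationError instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_remaining(fetch_not_after("example.com"))` (with your own domain in place of the placeholder) gives the number of days left; a verification error on connect suggests the certificate is expired or otherwise invalid.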
Report: "Forms down"
Last update: Forms.ontraport.com had an issue where forms were no longer processing, resulting in a 502 error. Downtime was between 12:41 PM PST and 1:34 PM PST on 2/3/2025.
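As a quick check of the window reported above, the downtime length can be computed from the two timestamps. This sketch assumes the date is written month/day/year; because both times share the same day and time zone, the duration does not depend on the zone.

```python
from datetime import datetime

# Timestamps from the update above; month/day/year order is an assumption.
FORMAT = "%m/%d/%Y %I:%M %p"
start = datetime.strptime("2/3/2025 12:41 PM", FORMAT)
end = datetime.strptime("2/3/2025 1:34 PM", FORMAT)

downtime_minutes = int((end - start).total_seconds() // 60)
print(downtime_minutes)  # 53
```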
Report: "App Down"
Last update: The incident has been resolved. Our main cache failed and the failover did not take. Everything should be fully operational now.
We are continuing to monitor for any further issues.
Getting closer. Working on bringing services back up.
We think we see the issue. Working on a resolution.
Having issues loading the app. Investigating the cause.
Report: "Email Delivery Issue"
Last update: The issue has been resolved and emails are flowing as expected. Any emails sent during the outage window will send (or already have).
Getting a little closer. Seems like they changed their transit carrier. Still having connectivity issues though.
We are waiting for the underlying carrier to resolve the issue. We will provide an update when there is more information.
We are waiting for the underlying carrier to resolve the issue. We will provide an update when there is more information.
We are currently working to resolve an issue regarding Email Delivery. Intermediary network services related to our Email Delivery edge network are experiencing routing issues which are preventing timely delivery of emails. No emails have been lost, but there is a delay in delivery of emails.
Report: "Email delivery latency"
Last update: Email Delivery services have been fully restored and are stable.
We are still waiting for the final "OK" from the DC, but email services are currently restored. Any queued email (those which were scheduled during the outage, or processed during) are now exiting the queues and being delivered. We will continue to monitor this until it is confirmed 100% resolved, and will update the status accordingly if the services deteriorate again.
We are still waiting for a resolution from the DC.
We continue to wait for resolution from the DC. As mentioned, emails continue to queue and process but are unable to be delivered. When the DC is back online, the emails will complete the sending process.
We are still waiting for resolution from the DC regarding the outage.
After restoring services, the remote DC has once again gone down. Awaiting resolution from the carriers/providers involved.
Remote DC is back online, and most impacted services are working normally. We'll continue recovery and update when everything is operational again.
Emails will continue to be queued and processed, but will experience a delay in delivery. Once the issues at the remote datacenter are resolved, the emails will send normally.
Updating to a major outage as it appears all access to the datacenter in question has been lost.
We are experiencing email delivery latency due to an issue related to networking at a datacenter. We are investigating the issue.
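The updates in this report describe emails being queued during the outage and delivered once the remote datacenter came back. A minimal sketch of that store-and-forward idea follows; the `DeliveryQueue` class and its `send` callback are hypothetical and only illustrate the general pattern, not Ontraport's actual delivery system.

```python
from collections import deque


class DeliveryQueue:
    """Hypothetical store-and-forward queue: messages stay queued until sent."""

    def __init__(self, send):
        self._send = send        # callable(message) -> bool, True on success
        self._pending = deque()

    def enqueue(self, message):
        self._pending.append(message)

    def drain(self):
        """Try each pending message once; requeue failures. Returns sent count."""
        sent = 0
        for _ in range(len(self._pending)):
            message = self._pending.popleft()
            if self._send(message):
                sent += 1
            else:
                # Delivery failed (e.g. the network path is down): keep it queued.
                self._pending.append(message)
        return sent

    def __len__(self):
        return len(self._pending)
```

While `send` keeps failing, `drain()` returns 0 and nothing is dropped; once the path recovers, the next `drain()` delivers the backlog in its original order.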
Report: "Isolated Account Latency/Errors"
Last update: Services have recovered; we will continue to monitor performance.
We are seeing increased error rates and latency for isolated accounts. Recovery is already underway and we are not expecting a long impact window.
Report: "High latency and timeouts"
Last update: All accounts and services appear recovered. If you continue to have issues, please contact support.
All accounts should be in functional order. We will continue to monitor to ensure there are no further issues.
We have identified the issue and are working on a full resolution. Services for almost all accounts have already been restored.
We are currently investigating an issue regarding timeouts and high latency.
Report: "Database Latency"
Last update: The issue appears to be resolved and stable. We are going to continue to investigate this to more thoroughly apply a fix.
The issue should be resolved, but we are continuing to monitor.
We are seeing some increased latency within our database infrastructure that may result in errors for isolated accounts. We are working to restore performance.
Report: "Yahoo Delivery Delays"
Last update: Yahoo has indicated that the issue on their end has been fixed. Any queued emails on our side will send as normal once the existing queues have cleared.
Yahoo is experiencing issues in their email systems, resulting in emails sent from our system being delayed. Emails are not being lost; they are merely delayed. We will continue to monitor this issue.
Report: "Isolated DB Slowness"
Last update: This issue has been resolved.
We are continuing to monitor for any further issues.
The fixes applied are having the intended effect. Most accounts impacted by this event are operating as expected. We will continue to monitor this situation to ensure that services are running optimally.
We are continuing to address the latency issues associated with the affected accounts. Fixes appear to be working, and will continue to be applied.
Some clients may experience in-app slowness. This issue is isolated to a limited number of accounts. We are working on a solution.
Report: "Database Issues"
Last update: This issue has been resolved.
Services have been restored, and we are continuing to monitor.
We have identified a fix and are working on implementation.
Some database services are experiencing issues that are impacting some accounts. We are currently working to resolve the problem.
Report: "Intermittent Service Outage"
Last update: From 2024/04/23 16:21 to 2024/04/23 17:25, we experienced intermittent service failures involving the following:
* Form submissions
* Link click redirects
* Page visit tracking
* Loading of in-app collections (Contacts, etc.)
The underlying issue in the tracking subsystems that caused this outage has been repaired and hardened against future failure.
Report: "Isolated DB Issues"
Last update: Databases and services appear to be running normally.
We will continue to monitor this and further identify the source of the issues, but services should be restored.
A fix has been applied and we are running data verification now.
We are currently investigating some database related issues. This may be affecting isolated accounts and services.
Report: "Connectivity issues."
Last update: The incident was a carrier issue going into the datacenter. It was short-lived and seems resolved now.
We are currently investigating connectivity issues.
Report: "Database Maintenance"
Last update: No further issues have been observed. Closing this incident as resolved.
We are continuing to monitor for any further issues.
We have identified some additional issues and are applying fixes. The issues are currently isolated to specific accounts and are being addressed.
Servers have been restored and have been stable; we will continue monitoring this issue.
A database update has failed, resulting in isolated accounts being impacted. We are aware of the issue and recovery is already underway.
Report: "Database Issues"
Last update: This incident has been resolved.
We have released some fixes that should prevent this from occurring again. As before, we will continue monitoring, but the system has been operational since the last update.
Recovery has been completed. We will continue to monitor the impacted systems until we are certain that the issue has passed.
We are seeing some database related issues that may be preventing access to some accounts. The issues have been identified, and restoration efforts are underway.
Report: "System Outage"
Last update: This incident has been resolved.
A database failure resulted in lack of access to the Ontraport application. While services are currently restored, we will continue to monitor for any lingering issues.
Report: "Login Challenges"
Last update: We had an issue with our main login handler that left accounts unable to log in.
Report: "Slow campaigns and nightly subscription processing"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
We think things are speeding up again.
We have noticed slow processing all day across our DB stack. The team is troubleshooting. Queued items will go out, but it is taking longer to get through the backlog.
Report: "Increased error rates and slowness for isolated accounts"
Last update: This issue has long been resolved; the "monitoring" status was left up erroneously.
Services appear to be running optimally. Resolving this issue, but will continue to monitor for any changes.
Services have been restored; we will continue to observe performance and stability.
The impacted database services are currently undergoing automated verification.
We are bringing some degraded services back online currently. Resolution is expected soon.
Currently addressing a serious issue with some underlying database hardware.
We are continuing to monitor the impacted accounts.
The majority of issues have been resolved. If you believe that your account is impacted, please contact support@ontraport.com. This issue will be left in "minor" and "monitoring" states until we are certain that per-account issues have been resolved.
The majority of issues have been resolved, but if you believe your account to be impacted by this service issue please contact support@ontraport.com for further investigation. Efforts will continue until complete resolution.
Around 23:00 Pacific time, we will be doing some triage that will result in a couple minutes of downtime to address this; while our initial fixes did reduce impact, they were not complete, so some emergency maintenance will be necessary.
The initial fix reduced incidence of problems, but did not fully resolve the issue. Efforts to address this are continuing.
We are continuing to monitor for any further issues.
The fix is being applied and we will be observing if it fully resolves the issue.
We are working on a fix for this and have a path forward. Updates to follow as timelines for implementation are identified.
We are currently investigating an issue that is impacting isolated accounts and results in error messages and slow loading of content. We are aware of the underlying issue and are working to find a fix.
Report: "System Outage"
Last update: The issue has been resolved. Further investigation shows that this only impacted inbound HTTP traffic; backend processing (automations, emails, etc.) appears unaffected.
We are continuing to monitor for any further issues.
Services should be restored. We will continue monitoring the systems to ensure that there are no lingering issues.
We are continuing to work on a fix for this issue.
While performing routine maintenance, an unexpected issue has been encountered that has rendered most services unavailable. We are working to restore services currently.
Report: "DNS Issues"
Last update: The primary issue has been resolved. We will continue to monitor and mitigate any subsequent attacks.
We have mitigated a large portion of the attack. Services should be returning to normal, and we will continue to monitor the results.
We have identified a possible solution and are working on implementation.
Due to the sustained nature and scope of impact, updating to Major Outage. We are continuing to attempt mitigation of the attack, and will update the status as progress is made.
We are continuing to work on a fix for this issue.
We are currently working on a fix for what appears to be a DDoS on one segment of our DNS infrastructure. This could have an impact on many services.
Report: "Isolated Email Delivery Latency"
Last update: Delays are gone, but we will be continuing to refine the systems handling the impacted sending segments.
Queues are returning to normal; we will continue to monitor this event and implement further fixes if necessary.
We are currently investigating delays in delivery for isolated sending segments.
Report: "Delays on some email sends"
Last update: Issues should be resolved; we will continue to observe the delivery latency and act accordingly.
Email queues are draining, and we are continuing to make changes to reduce latency events like this one.
We are currently looking into an issue with sending delays for isolated email segments. Emails are sending, but might experience a delay.
Report: "Bad Code Push"
Last update: As we continue to create and add new capacity and features to our infrastructure, we routinely integrate completely new and tested systems. In this case, one of our production-ready test systems received a bad code push which resulted in some strange in-app behaviors for some portion of the logged-in community. The servers were immediately removed from rotation, and any potential bad code was purged from various production pathways. We have since updated our integration process to catch some additional areas where old code could be blended into production pathways.
A bad code push on a production-ready testing system resulted in some strange in-app behaviors for some users; for instance, "Automations" changing back to the legacy name of "Campaigns".
Report: "Automation Failures"
Last update: We resolved the issue and added some additional recovery processes to prevent future occurrences.
Some automation processors failed to execute correctly, resulting in a potential loss of (or failure to execute) automation processes. The impact would have been effectively random, but would likely manifest in specific automation events not firing as expected during the impact window.
Report: "Increased Error Rates: Timeouts"
Last update: We identified a rare failure mode that can happen during our routine code release process. We fixed the issue, and are placing additional monitoring on this, as well as updating the release process to catch the failure mode sooner and prevent it from manifesting.
Our servers started reporting an increased rate of timeout related errors. This would have shown as non-responsive content/pages in various areas of our platform. These failures would have impacted potentially 15-20% of web requests.
Report: "WordPress Site Hosting"
Last update: A critical file system integrity and security check was triggered, which automatically and disruptively began a deep inspection of all hosted site contents. The check was not prompted by a suspected compromise; it was the result of a broken/incorrect integrity check. This triggered a routine check of the systems, but the scope of the scanning resulted in a nearly two-hour window where services on one of our servers were unavailable.
Some of our hosted WordPress sites are currently offline. We are investigating the nature of the failure, and working to bring them online again.
Report: "Increased Error Rates"
Last update: Primary issues have been resolved, and we are closing this incident. We will continue to observe the systems and services for any lingering or related issues. The cause of the issue was an unexpected series of events that led to a test suite being executed against a couple of critical subsystems. This resulted in an outage and, in a couple of cases, a need to restore some data from backups. This did not affect client data as it was targeted against a couple of administrative systems. Constraints have already been put in place to limit the ability for tests to be run in this fashion. A full audit of access control systems will follow.
We are currently investigating an issue impacting various parts of our service, and accounts. We will update as we know more.
Report: "Systems Failure"
Last update: One of our physical servers experienced a low-level networking failure related to a very rare expression of a bug in the kernel's network modules. This error, unfortunately, took down several critical subsystems that are required for proper functioning. As a result of this failure, we are working toward re-evaluating the distribution and redundancy of the impacted systems so that this rare, but ultimately quite large, failure mode will not be encountered in the future.
On 2022/05/14, 22:48 PDT we experienced a loss of critical hardware due to a low-level networking failure. This failure resulted in several critical subsystems being taken offline for an extended period of time, which resulted in many services being unavailable.
Report: "Landing Page Hosting Issue"
Last update: As we continue to update and upgrade our infrastructure, we are sometimes required to migrate data to newer systems. One of our centralized components having to do with landing page hosting was moved without issue, but a now-legacy (but still in use) component of our code was not able to properly interact with the new systems. This communication issue resulted in some accounts being unable to properly host, and possibly view, landing pages. Once the issue was discovered, it was quickly resolved and will be reviewed more in depth as we continue our upgrade and update process.
The release seems to have resolved the issue with landing page hosting for the impacted accounts.
We are currently investigating, and fixing, isolated landing page hosting issues related to recent backend infrastructure changes. A hotfix is already prepared and being pushed.
Report: "DDoS Attack"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
It looks like the bulk of the efforts of the attacker have been mitigated. At this point, it'll be continued monitoring, and further hardening of our systems and services to prevent future events.
We are continuing to monitor the attacks, and continue to mitigate the impact.
We will continue to monitor this, and consider it an ongoing event. As the attack persists, we will continue to mitigate. We expect this to continue for some time.
We are continuing to investigate this issue.
We are currently experiencing a DDoS attack and are working on mitigating it. More to follow.
Report: "Services Unavailable: Cloudflare Outage"
Last update: It looks like Cloudflare has resolved, or mostly resolved, the issue. It appears there was a backbone issue on their end that broke access from several locations. It wasn't worldwide, but was certainly widespread. We will continue to monitor for issues, but the majority should either already be resolved or will be resolved shortly.
Cloudflare is currently experiencing an outage which is, in turn, taking some of our services down. We are monitoring the situation, and awaiting more information. Additional information about the Cloudflare outage can be found here: https://www.cloudflarestatus.com/
Report: "Database Issue"
Last update: At approximately 08:58 PST, one of our database servers experienced an unexpected restart event. This resulted in an isolated number of accounts losing some account functionality. The infrastructure team was alerted immediately to the loss and began standard recovery procedures, which resulted in services being restored fully by 09:28 PST. The nature of the event is understood, but its cause is as yet unknown. Issue tickets with the hardware vendor have been opened and we will continue to investigate the root cause. Logs, monitoring, and hardware events do not indicate the source of the restart; as such, we will continue investigating until we are satisfied that we understand the full sequence of events.
The systems and services look like they are working normally, and no additional errors are being observed or reported.
Services have been restored; we will continue to monitor and address any lingering issues as they appear or are identified.
We have identified the issue and are verifying the fix.
We are currently investigating a database issue that may result in account/service disruption for some accounts.
Report: "Service Issue"
Last update: We will continue to monitor our systems and services regarding this, but are considering this issue resolved. A critical network device failed, and failing over to redundant systems did not fully address the problem. The device has been replaced and services restored, and we will attempt to implement additional failover pathways to prevent future issues.
We are continuing to monitor for any further issues.
We have implemented a fix and will continue to monitor; services should be restored.
We are continuing to work on a fix for this issue.
We have identified the source of the issue and are working on a fix.
Services remain impacted, and we continue to investigate.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
An issue of unknown severity and origin has occurred and is currently being investigated.
Report: "Isolated DB Issue"
Last update: Maintenance is complete. Any impacted accounts should now be operational.
One of our database servers is undergoing emergency repairs. This affects a limited number of accounts.
Report: "Email Delivery Delay"
Last update: We are experiencing a delay in processing our Email Delivery queues for random segments of traffic. The underlying issue has been resolved, but emails may experience a delay in sending (not presented in the Ontraport application). Due to the nature of the delay, it is not known which specific sends are affected, but no emails have been lost.
Report: "Email Delivery Delays"
Last update: We have not observed any abnormal delays since applying the last round of fixes, and emails are flowing normally.
We are continuing to monitor for any further issues.
The queues and overall delivery latency remain in an acceptable state. As before, we will leave this issue open and continue monitoring for abnormalities in delivery times.
The recent changes appear to have addressed the bulk of the latency. While some emails may still experience latency, the majority of emails are flowing normally. As before, we will continue to monitor this ongoing issue and remedy as needed.
The underlying issues causing the queuing have resurfaced. We are mitigating on our end where possible, while the issues external to Ontraport's systems are being addressed.
We are continuing to monitor for any further issues.
Email queues have been drained and are flowing normally. They returned to normal several hours ago, but we are going to continue to monitor this issue over the next several days.
We have identified the issue and are working to mitigate. The queues are decreasing in size and the bulk of email is being processed as received. While there will continue to be some delays, they should be reducing over time until it returns to normal.
We are currently investigating abnormally large queue times within our email processing systems. This issue can result in delays in email sends.
Report: "HTTPS Certificate Maintenance"
Last update: Maintenance has been completed.
In preparation to update our certificate issuance process, the automatic HTTPS certificate generation system will be delayed slightly. This should only impact HTTPS for new pages on previously unhosted domains (i.e., those that do not already have HTTPS certs). The maintenance window should be brief.
Report: "Database Issues"
Last update: The repairs appear to be holding. We will continue to monitor the impacted services for any further issues, but the core issue is resolved.
Services have been restored, and we will continue monitoring for any residual issues.
Repairs to the databases have begun and should be completed shortly.
We are currently investigating an issue with database servers that is affecting a small subset of accounts. The issue is known, and a fix is being implemented.
Report: "Email Send Delay for Isolated Segments"
Last update: Queues have cleared. After several hours of monitoring, the issue largely appears to be related to typical throttling on some of the older-engagement segments. We will update some of the pathways to try to prevent future occurrences.
We are seeing delays in sending regarding some email send segments, notably with contacts that have not engaged in 120+ days; the largest delayed segment is 1 year+ with no engagement. The bulk of the delays appear to be caused by ISP throttling. Other sends are not particularly impacted. We will continue to monitor this as the day moves forward.
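To illustrate the engagement segments mentioned above, here is a small sketch that buckets a contact by days since last engagement. The thresholds (120+ days and 1 year+, taken here as 365 days) come from the update; the function name and segment labels are illustrative assumptions, not Ontraport's actual segmentation logic.

```python
from datetime import date


def engagement_segment(last_engaged, today):
    """Bucket a contact by the number of days since their last engagement."""
    idle_days = (today - last_engaged).days
    if idle_days >= 365:      # "1 year+" segment (365 days is an assumption)
        return "1 year+"
    if idle_days >= 120:      # "120+ days" segment
        return "120+ days"
    return "recent"
```

For example, a contact last engaged on 2024-09-01 and evaluated on 2025-01-01 has been idle for 122 days and lands in the "120+ days" bucket.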
Report: "Potential DNS DDoS"
Last update: This incident has been resolved.
The primary issues appear to be resolved, but we are going to continue monitoring and hardening our systems. All affected services should be restored and functioning normally.
We are working to mitigate the effects of the traffic, but it is causing some isolated disruption.
We are currently investigating what appears to be a DNS DDoS attack on some of our infrastructure. This may result in slowness in loading pages for affected zones, or browser error messages along the lines of "We're having trouble finding that site." We will update with more information as it becomes available.
Report: "Database Issue"
Last update: Database issues have been resolved. Affected accounts are currently active and fully functional.
Database issues for affected accounts should be resolved. We will continue monitoring as usual. If you observe any issues, please reach out to support@ontraport.com.
Database failure has been confirmed as critical. We are currently working to restore functionality.
We are continuing to address the impacted database infrastructure.
We are continuing to work on a fix for the impacted databases.
We are currently experiencing an issue with part of our database infrastructure; we are working on a fix.
Report: "Database Issue"
Last update: The issue has been mitigated. All affected accounts have been moved to a new, more performant box.
We are continuing to work on the issue.
We are continuing to work on a fix for this issue.
We are experiencing an issue with the same part of our database infrastructure that had issues earlier; we are working on a fix.