Historical record of incidents for Kickbox
Report: "Verification result inconsistencies following Yahoo/AOL update"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are actively addressing an issue causing inconsistencies in the verification results for Yahoo and AOL emails. This problem has been traced to a breaking change in a recent update from Yahoo/AOL that affects our results. We have identified the issue and we are working towards a resolution.
Report: "EU ONLY - System Outage"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
Currently investigating an issue that is causing an outage on our EU instances.
Report: "Partial Outage for Account Sign-in and some API"
Last update: Incident resolved.
Fix implemented and results being monitored.
Issue currently under investigation.
Report: "API Service Degradation"
Last update: This incident has been resolved.
A fix has been deployed. We are now monitoring services.
Issue identified. Preparing deployment for fix.
We are currently investigating an issue causing slower than normal API responses for a fraction of requests.
Report: "Partial API Outage"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
Partial outage. Investigating.
Report: "Minor Service Degradation"
Last update:

# August 21, 2024 Service Degradation Investigation

**Date**: 08/21/2024
**Status**: Mitigated

### Summary

A service degradation was detected by our monitoring services between 12:34 PST and 12:48 PST. We removed the impacted servers from rotation, restoring full service. We investigated and solved the issue before bringing the impacted servers back into production rotation.

### Impact

API service degradation in 45-second bursts for a 12-minute period, beginning at 12:34 PST, causing some requests to respond with a 502 'Bad Gateway'. The issue was limited to a small subset of our production servers, so fewer than 3% of global API requests during this time were affected.

### Root Cause(s)

During a routine canary upstream service update, a small number of canary servers became unable to connect to an upstream internal service. Due to this problem, a fraction of requests to these servers caused a connection timeout, resulting in 502 Bad Gateway errors from our proxies. This problem was sporadic, so the proxies did not automatically remove these servers from rotation.

### Resolution

Our internal monitoring systems detected this degradation and we worked to manually remove these servers from production rotation. This restored the API to full health. After our investigation, we resolved the communication issues with a full code re-deploy and service restart. We then ran a full suite of API health checks, confirmed the problem was resolved, and re-added the servers into production.

### Action Items

We will perform a review of our internal monitoring systems to decrease the time to alert, and review our proxy health-check endpoints to see if they can be improved for partial service degradation.

## Timeline
Team is investigating the cause of a minor service degradation occurring between 2:30pm and 3:00pm (central time). Details to come.
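The root cause in the post-mortem above notes that the failures were sporadic, so the proxies did not remove the affected servers automatically. One way a health check can catch partial degradation is to judge a server by its recent failure rate rather than by its most recent probe. The following is a minimal Python sketch of that idea, not Kickbox's implementation; the class name `WindowedHealth`, the window size, and the failure-rate threshold are all hypothetical.

```python
from collections import deque


class WindowedHealth:
    """Track recent request outcomes for one backend server (hypothetical helper)."""

    def __init__(self, window_size=100, max_failure_rate=0.05):
        self.max_failure_rate = max_failure_rate
        # Only the most recent `window_size` outcomes are kept; True means success.
        self.outcomes = deque(maxlen=window_size)

    def record(self, ok):
        """Record the outcome of one request or probe."""
        self.outcomes.append(bool(ok))

    def failure_rate(self):
        """Fraction of recorded outcomes in the window that failed."""
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def is_healthy(self):
        """Healthy only while the windowed failure rate stays at or below the threshold."""
        return self.failure_rate() <= self.max_failure_rate


# Simulate a server where every fourth request fails (sporadic errors).
health = WindowedHealth(window_size=20, max_failure_rate=0.1)
for i in range(20):
    health.record(i % 4 != 0)

last_probe_ok = health.outcomes[-1]   # a "last probe only" check would look at this
windowed_ok = health.is_healthy()     # the windowed check looks at the failure rate
```

In this simulation the most recent request succeeded, so a check that looks only at the latest probe would keep the server in rotation, while the windowed check flags it because 5 of the last 20 requests failed.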
Report: "ESP Integrations"
Last update: This incident has been resolved.
We're currently experiencing reduced performance with several of our integrations within Kickbox, including Iterable, HubSpot, and Klaviyo, which is affecting their functionality. Our team is actively working to resolve these issues and restore full service. Users can still use our email verification services through list verification or the API. If you have any additional questions, please reach out to help@kickbox.com.
Report: "Partial API Outage"
Last update: An unexpected 7-minute partial outage occurred. The root cause is still under investigation.
Report: "EU Verification Outage"
Last update: A networking error caused a severed connection to the EU verification engine. Manual reconnection has restored services.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating an issue with the EU verification instance.
Report: "Web App Sign-in"
Last update: A DNS change with an Amazon service temporarily broke sign-in for some users Sunday morning (API unaffected).
Report: "Unplanned Outage - API and WebApp"
Last update: A minor configuration change (adding an alias to a Redis instance) caused an unexpected error and prevented the Kickbox app from restarting after deployment. The engineering team quickly detected the misconfiguration and deployed a rollback tag, restoring service within minutes. The total outage was 11 minutes.
There was a small 10-minute outage due to an unforeseen network issue during a routine deployment.
Report: "Degraded service - api and app"
Last update: We had degraded service due to a partial datastore service degradation. We repaired and restarted the datastore. No data was lost.
We're investigating a period of degraded service beginning at 5:54AM CST and lasting until 7:30AM CST.
Report: "Partial Service Outage for some EU users"
Last update: An issue with an upstream service provider created a window of unplanned downtime for a small fraction of EU users (fewer than 1%). The partial service outage lasted from 3pm September 19th until about 3pm the next day.
Report: "DMARC deliverability tools suite"
Last update: This incident has been resolved.
Still investigating the case.
In the deliverability tools suite, DMARC data has stopped populating as of September 1st. Engineering is investigating the issue and we hope to have more information and a resolution soon.
Report: "DMARC reporting system."
Last update: We've corrected the issue with DMARC that was preventing ongoing data collection. Ongoing, inbound DMARC reports should be properly received, parsed, logged and displayed in reporting. If you have any questions, please don't hesitate to reach out to us at help@kickbox.com.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
Our system failed to properly log and summarize inbound aggregate DMARC reports. Our engineers are aware of the issue and are actively working to address it as soon as possible.
Report: "Partial API Outage"
Last update: The new SSL cert that was implemented did not support `TLS1.2`. This caused issues only for users who required that support.
Just after 11:00 am (central) a routine TLS certificate update caused a misconfiguration. Some users experienced a partial outage.
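A misconfiguration like the one described above can be probed from the client side by attempting a handshake that is pinned to TLS 1.2. The following is a hedged Python sketch using only the standard library `ssl` and `socket` modules; the function names `tls12_only_context` and `accepts_tls12` are hypothetical, and this is not a description of how Kickbox detected or verified the issue.

```python
import socket
import ssl


def tls12_only_context():
    """Build a client context that will negotiate TLS 1.2 and nothing else."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def accepts_tls12(host, port=443, timeout=5.0):
    """Return True if `host` completes a TLS 1.2 handshake.

    Returns False when the handshake fails with an SSL error, which includes
    protocol-version rejection and certificate verification failure.
    Plain network errors (DNS failure, refused connection, timeout) are not
    caught and propagate to the caller.
    """
    ctx = tls12_only_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version() == "TLSv1.2"
    except ssl.SSLError:
        return False
```

Calling `accepts_tls12("example.com")` against an endpoint after a certificate or TLS configuration change would give a quick signal for clients that can only speak TLS 1.2. The check requires network access, so it is not run at import time here.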
Report: "Web App Outage"
Last update: All services are now working normally. Cached data in Redis does not appear to have been properly cleared. As a result, these appliances ran out of memory. Engineering strategies are currently being developed to ensure a similar issue does not occur in the future.
Services have been restored and engineers will begin monitoring duties.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating an issue where some users cannot sign into the web application. Currently making adjustments that may prevent web app usage for all users.
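The resolution note for this incident attributes the outage to cached data that was never cleared until memory ran out. A common general safeguard against that failure mode is to give every cached entry an expiry time. The following is a small in-memory Python sketch of the idea, offered as an illustration only; the class `TTLCache` and its parameters are hypothetical and do not describe the strategy Kickbox adopted. In Redis itself, the analogous mechanism is setting an expiry on keys (for example `SET key value EX seconds`).

```python
import time


class TTLCache:
    """Minimal cache whose entries expire after a fixed time-to-live (hypothetical)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self.store = {}      # key -> (expires_at, value)

    def set(self, key, value):
        """Store a value that expires `ttl_seconds` from now."""
        self.store[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        """Return the cached value, or `default` if missing or expired."""
        entry = self.store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.store[key]   # drop the stale entry on access
            return default
        return value

    def purge_expired(self):
        """Remove all expired entries and return how many were removed."""
        now = self.clock()
        expired = [k for k, (exp, _) in self.store.items() if now >= exp]
        for k in expired:
            del self.store[k]
        return len(expired)
```

Calling `purge_expired()` periodically bounds memory use even for keys that are written once and never read again.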
Report: "List Uploads Failing"
Last update: List uploading has resumed normal operation.
Our upload provider, Filestack, is experiencing an outage. As such, list uploads are failing. Filestack's outage status can be tracked here: https://status.filestack.com/incidents/hcyc09btgtxj. We will keep this status page updated accordingly.
We are investigating an issue with an upstream provider.
Report: "API and List Verification Delays"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are investigating API and list verification delays.
Report: "Intermittent API Responses"
Last update: API and list verification have resumed normal operation. We will continue to investigate the root cause and monitor for future performance issues.
We are continuing to investigate this issue.
We're investigating intermittent API responses and list verification issues.
Report: "DNS Resolution Errors"
Last update: This incident has been resolved. More information can be found via our upstream provider: https://www.cloudflarestatus.com/incidents/b888fyhbygb8
DNS resolution has returned to normal. We are continuing to monitor the situation.
We're investigating DNS resolution issues from our DNS provider.
Report: "Problem with upstream certificate provider"
Last update: API has been operating normally since the creation of this issue. A permanent fix is in place for the upstream certificate provider. Marking as resolved.
An upstream certificate provider is experiencing issues. We've routed traffic away from that provider until the issue is resolved.
Report: "Elevated error rates with verification API"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Issues with Auto-Recharge Payments"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We're investigating an issue where users are receiving failed auto-recharge notification messages. It's safe to ignore this notification. We will update you when this issue is resolved.
Report: "Website Outage"
Last update: This incident has been resolved.
Website has returned to normal operational status.
Kickbox's marketing site (https://kickbox.com) is currently down due to an outage from an upstream provider. All other systems including the management interface and API remain operational.
Report: "Kickbox Application Outage"
Last update: This issue has now been resolved.
Access to the Kickbox application was affected by Cloudflare's outage. Processing email lists, API requests, and billing were unaffected.
Report: "Increased Verification API Error Rates"
Last update: This issue has been resolved.
API has been operating normally since 12:25 UTC. We're continuing to monitor.
We are seeing increased error rates from the verification API.
Report: "System outage"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We've determined that the API is also being affected by this outage and are continuing to resolve the issue.
We've identified the issue causing the outage and systems have been partially restored. There may be continued delays and outages while we resolve the issue.
We are continuing to investigate this issue.
We are currently investigating a system outage that is affecting our website and application, including email list verification.
Report: "Network Outage"
Last update: All systems have resumed normal operation.
We are investigating a network outage.
Report: "System Outage"
Last update: Services restored.
We are currently undergoing an unscheduled outage.
Report: "Temporary Outage"
Last update: Kickbox has concluded the unscheduled maintenance.
Kickbox is conducting unscheduled maintenance on the database server. Service will resume momentarily.
Report: "Slow List Verification"
Last update: List verification performance has returned to normal.
List processing and stats have resumed normal operation.
We have identified the issue and are working towards a resolution. Jobs are mostly processing normally. However, stats may not be updating for some users' list verifications.
Some users are experiencing slow and/or restarting list verifications. Engineers are working to identify the root cause of the delay.
Report: "Investigating upstream network performance degradation"
Last update: All systems have resumed normal operation.
The issue appears to be related to an upstream provider, Cloudflare. More information: https://www.cloudflarestatus.com/
We are currently investigating this issue.
Report: "Delayed List Verification"
Last update: This incident has been resolved.
We are currently experiencing a large queue of list verifications. Some jobs may take several minutes to start.
Report: "Investigating a Network Outage"
Last update: We have completed the migration to an unaffected network. All systems should now be operating normally.
We are rerouting traffic to another network to circumvent network issues with an upstream provider.
Network operations have been restored. We're waiting on a post-mortem from the upstream network provider.
We are currently investigating this issue.
Report: "Emergency Maintenance"
Last update: Maintenance has been completed.
Kickbox is conducting emergency maintenance and will be back online shortly.
Report: "Slow response times during some requests in the Management Interface"
Last update: The management interface should now be operating normally.
We're currently experiencing some slow response times for certain requests in the management interface. This is not affecting the API or list verification. We have identified the cause and will be resolving shortly.
Report: "Delayed or Halted List Verification"
Last update: List Verification should now be operating normally.
We are investigating an issue that is causing list verifications to become delayed or halt entirely. The API is not affected.
Report: "Emergency Maintenance"
Last update: Emergency maintenance has been completed and list verification is functioning normally.
We are currently performing emergency maintenance that may impact list verification performance.
Report: "Packet Loss"
Last update: Our network provider has confirmed the issue is resolved.
Traffic flow has returned to normal. We're continuing to monitor the situation.
We are investigating occasional packet loss with our upstream network providers that may be affecting some API users.
Report: "Intermittent Network Latency"
Last update: API and Interface have resumed normal operation.
Some users are experiencing slower than usual response times from the API and user interface.
Report: "Network Outage"
Last update: Our upstream network provider issued the following post-mortem: A configuration change to one of our BGP peers caused the upstream sessions to restart. The problem has been resolved and corrected so it will not occur again.
Operations have returned to normal at this time. We are continuing to monitor the situation.
We are investigating a network outage affecting the API and Management Interface.
Report: "Increase in response times and unknown verification results"
Last update: Verification response times and unknown result rates have returned to normal.
We are currently experiencing greater than average response times and an increase in unknown verification results. The issue has been identified and we are working towards a resolution.
Report: "Scheduled Maintenance"
Last update: Maintenance complete! Some Recipient Authentication messages were delayed during the maintenance. Those messages will be relayed shortly.
We are currently undergoing a scheduled maintenance. Web UI will be temporarily inaccessible. Verification API should remain unaffected throughout this maintenance.
Report: "Slow Verification Response"
Last update: All verifications have been operating normally since our previous update. Marking this incident as resolved.
Verifications have returned to normal parameters. We are continuing to monitor the situation as we further our root cause analysis. We will update with further information at that time. In the meantime, we are re-enabling list verifications.
Verification API is reporting close-to-normal response times now. We are still working to identify the root issue. Verification jobs have been paused until the issue is identified.
We are investigating slow/failing verification responses.
Report: "Errors with upstream provider"
Last update: The upstream provider issues have been resolved and Kickbox services have returned to normal.
Amazon S3 is experiencing an outage that is affecting Kickbox's list verification and integration services. We are currently monitoring the situation. Verification jobs may experience delays in processing.
Report: "Some delays in email delivery for Recipient Authentication messages"
Last update: Deliveries have returned to normal. Any delayed messages should arrive shortly.
Our upstream provider is reporting some email delivery delays, which are affecting Recipient Authentication message delivery. More info here: https://status.sparkpost.com/incidents/v53x9z1rb7jh
Report: "Outage from Upstream Provider Causing Delays"
Last update: All systems have resumed normal operation.
All systems are resuming normal operation. We will continue to monitor the situation.
An outage from an upstream compliance provider is causing delays in account creation, account management, billing, and list verification. We are posting an update to temporarily circumvent our reliance on this provider.
Report: "Intermittent issues with list upload / download"
Last update: This issue is resolved and the service is operating normally.
List upload and download are operating normally. We will continue to monitor the situation.
We are seeing increased error rates from Amazon's S3 service, which are causing some delays and issues with list upload and download.
Report: "Diminished Website and API Service"
Last update: This issue has been resolved.
An issue was identified with top-level .io nameservers that resulted in DNS resolution problems for several .io domains across the internet. The issue has been resolved by the .io registrar, nic.io, but some customers may continue to experience issues reaching kickbox.io or using the Kickbox API. This problem may continue until the bad resolution expires from cache.
The downstream provider is aware of the issue and is currently working towards a resolution.
We are investigating an issue with a downstream provider that may be causing problems accessing kickbox.io and the Kickbox API.