Kickbox

Is Kickbox Down Right Now? Check if there is a current outage ongoing.

Kickbox is currently Operational

Last checked from Kickbox's official status page

Historical record of incidents for Kickbox

Report: "Verification result inconsistencies following Yahoo/AOL update"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are actively addressing an issue causing inconsistencies in the verification results for Yahoo and AOL emails. This problem has been traced to a breaking change in a recent update from Yahoo/AOL that affects our results. We have identified the issue and we are working towards a resolution.

Report: "Verification result inconsistencies following Yahoo/AOL update"

Last update
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Identified

We are actively addressing an issue causing inconsistencies in the verification results for Yahoo and AOL emails. This problem has been traced to a breaking change in a recent update from Yahoo/AOL that affects our results. We have identified the issue and we are working towards a resolution.

Report: "EU ONLY - System Outage"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

Currently investigating an issue that is causing an outage on our EU instances.

Report: "Partial Outage for Account Sign-in and some API"

Last update
resolved

Incident resolved.

monitoring

Fix implemented and results being monitored.

investigating

Issue currently under investigation.

Report: "API Service Degradation"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been deployed. We are now monitoring services.

identified

Issue identified. Preparing deployment for fix.

investigating

We are currently investigating an issue causing slower than normal API responses for a fraction of requests.

Report: "Partial API Outage"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

Partial outage. Investigating

Report: "Minor Service Degradation"

Last update
postmortem

# August 21, 2024 Service Degradation Investigation

**Date**: 08/21/2024

**Status**: Mitigated

### Summary

A service degradation was detected by our monitoring services between 12:34 PST and 12:48 PST. We removed the impacted servers from rotation restoring full-service. We investigated and solved the issue before bringing the impacted servers back into production rotation.

### Impact

API service degradation in 45-second bursts for a 12 minute period, beginning at 12:34 PST causing some requests to respond with a 502 ‘Bad Gateway’. The issue was limited to a small subset of our production servers, so fewer than 3% of global API requests during this time were affected.

### Root Cause(s)

During a routine canary upstream service update, a small number of canary servers became unable to connect to an upstream internal service. Due to this problem, a fraction of requests to these servers caused a connection timeout, resulting in 502 Bad Gateway errors from our proxies. This problem was sporadic, so the proxies did not automatically remove these servers from rotation.

### Resolution

Our internal monitoring systems detected this degradation and we worked to manually remove these servers from production rotation. This restored the API to full health. After our investigation, we resolved the communication issues with a full code re-deploy and service restart. We then ran a full suite of API health checks, confirmed the problem was resolved, and re-added the servers into production.

### Action Items

We will perform a review of our internal monitoring systems to decrease the time to alert, and review our proxy health-check endpoints to see if they can be improved for partial service degradation.

## Timeline
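The postmortem notes that the failures were sporadic, so the proxies did not remove the affected servers automatically. One common way to catch that kind of partial degradation is a passive health check that looks at the error rate over a window of recent requests instead of waiting for consecutive failures. The following is a minimal sketch of that idea, not Kickbox's implementation; the class name, window size, and thresholds are all hypothetical.

```python
from collections import deque


class PassiveHealthCheck:
    """Tracks recent request outcomes for one backend server (hypothetical helper)."""

    def __init__(self, window_size=100, max_error_rate=0.05, min_samples=20):
        # True = success, False = failure; old entries fall off automatically.
        self.outcomes = deque(maxlen=window_size)
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples

    def record(self, status_code):
        # Treat gateway-class errors (502, 503, 504) as failures.
        self.outcomes.append(status_code not in (502, 503, 504))

    def error_rate(self):
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def should_eject(self):
        # Avoid ejecting a server on too little evidence.
        if len(self.outcomes) < self.min_samples:
            return False
        return self.error_rate() > self.max_error_rate


# A server that fails one request in ten never produces a long run of
# consecutive failures, but its windowed error rate still crosses the threshold.
check = PassiveHealthCheck()
for i in range(100):
    check.record(502 if i % 10 == 0 else 200)
print(check.should_eject())
```

A rate-based threshold trades some sensitivity to short bursts for the ability to flag servers that fail only a fraction of requests, which is the case described in the root cause above.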

resolved

Team is investigating the cause of a minor service degradation occurring between 2:30pm and 3:00pm (central time). Details to come.

Report: "ESP Integrations"

Last update
resolved

This incident has been resolved.

identified

We’re currently experiencing reduced performance with several of our integrations within Kickbox, including Iterable, HubSpot, and Klaviyo. Our team is actively working to resolve these issues and restore full service. Users can still use our email verification services through list verification or the API. If you have any additional questions, please reach out to help@kickbox.com.

Report: "Partial API Outage"

Last update
resolved

An unexpected 7-minute partial outage occurred. Root cause still under investigation.

Report: "EU Verification Outage"

Last update
resolved

A networking error caused a severed connection to the EU verification engine. Manual reconnection has restored services.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating an issue with the EU verification instance

Report: "Web App Sign-in"

Last update
resolved

A DNS change with an Amazon service temporarily broke sign-in for some users Sunday morning (API unaffected).

Report: "Unplanned Outage - API and WebApp"

Last update
postmortem

A minor configuration change (adding an alias to a Redis instance) caused an unexpected error and prevented the Kickbox app from restarting after deployment. The issue was quickly detected and resolved by the engineering team. The total outage was 11 minutes. The misconfiguration was identified quickly and a rollback tag was deployed to restore service within minutes.

resolved

There was a brief 10-minute outage due to an unforeseen network issue during a routine deployment.

Report: "Degraded service - api and app"

Last update
resolved

We had degraded service due to a partial datastore service degradation. We repaired and restarted the datastore. No data was lost.

investigating

We're investigating a period of degraded service beginning at 5:54AM CST and lasting until 7:30AM CST.

Report: "Partial Service Outage for some EU users"

Last update
resolved

An issue with an upstream service provider created a window of unplanned downtime for a small fraction of EU users (smaller than 1%). The partial service outage lasted from 3pm September 19th until about 3pm the next day.

Report: "DMARC deliverability tools suite"

Last update
resolved

This incident has been resolved.

investigating

Still investigating the case.

investigating

In the deliverability tools suite, DMARC data has stopped populating as of September 1st. Engineering is investigating the issue and we hope to have more information and a resolution soon.

Report: "DMARC reporting system."

Last update
resolved

We’ve corrected the issue with DMARC that was preventing ongoing data collection. Ongoing, inbound DMARC reports should be properly received, parsed, logged and displayed in reporting. If you have any questions, please don’t hesitate to reach out to us at help@kickbox.com

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

Our system failed to properly log and summarize inbound aggregate DMARC reports. Our engineers are aware of the issue and are actively working to address it as soon as possible.

Report: "Partial API Outage"

Last update
postmortem

The new SSL cert that was implemented did not support `TLS1.2`. This resulted in issues for only some users where that support was required.
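For API clients that depend on TLS 1.2, a quick way to confirm whether an endpoint still negotiates it is to attempt a handshake pinned to that version. The sketch below uses Python's standard `ssl` module; the function names are hypothetical, and the host to probe is left to the caller because the incident report does not name one.

```python
import socket
import ssl


def tls12_only_context():
    """Build a client context that will only negotiate TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def supports_tls12(host, port=443, timeout=5.0):
    """Return True if a TLS 1.2 handshake with host:port succeeds."""
    ctx = tls12_only_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version() == "TLSv1.2"
    except OSError:
        # ssl.SSLError is a subclass of OSError, so handshake failures,
        # refused connections, and timeouts all land here.
        return False
```

Calling `supports_tls12("example.com")` after a certificate or TLS configuration change gives a simple yes/no signal that could be run as part of a post-deployment check.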

resolved

Just after 11:00 am (central) a routine TLS certificate update caused a misconfiguration. Some users experienced a partial outage.

Report: "Web App Outage"

Last update
resolved

All services are now working normally. Cached data in Redis does not appear to have been properly cleared, which caused these appliances to run out of memory. Engineering strategies are currently being developed to ensure a similar issue does not occur in the future.

monitoring

Services have been restored and engineers will begin monitoring duties.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating an issue where some users cannot sign into the web application. Currently making adjustments that may prevent web app usage for all users.

Report: "List Uploads Failing"

Last update
resolved

List uploading has resumed normal operation.

identified

Our upload provider, Filestack, is experiencing an outage. As such, list uploads are failing. Filestack's outage status can be tracked here: https://status.filestack.com/incidents/hcyc09btgtxj. We will keep this status page updated accordingly.

investigating

We are investigating an issue with an upstream provider

Report: "API and List Verification Delays"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating API and list verification delays

Report: "Intermittent API Responses"

Last update
resolved

API and list verification has resumed normal operation. We will continue to investigate root cause and monitor for future performance issues.

investigating

We are continuing to investigate this issue.

investigating

We're investigating intermittent API responses and list verification issues.

Report: "DNS Resolution Errors"

Last update
resolved

This incident has been resolved. More information can be found via our upstream provider: https://www.cloudflarestatus.com/incidents/b888fyhbygb8

monitoring

DNS resolution has returned to normal. We are continuing to monitor the situation

investigating

We're investigating DNS resolution issues from our DNS provider.

Report: "Problem with upstream certificate provider"

Last update
resolved

API has been operating normally since the creation of this issue. A permanent fix is in place for the upstream certificate provider. Marking as resolved.

investigating

An upstream certificate provider is experiencing issues. We've routed traffic away from that provider until the issue is resolved.

Report: "Elevated error rates with verification API"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Issues with Auto-Recharge Payments"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We're investigating an issue where users are receiving failed auto-recharge notification messages. It's safe to ignore this notification. We will update you when this issue is resolved.

Report: "Website Outage"

Last update
resolved

This incident has been resolved.

identified

Website has returned to normal operational status.

identified

Kickbox's marketing site (https://kickbox.com) is currently down due to an outage from an upstream provider. All other systems including the management interface and API remain operational.

Report: "Kickbox Application Outage"

Last update
resolved

This issue has now been resolved.

monitoring

Access to the Kickbox application was affected by Cloudflare's outage. Processing email lists, API requests, and billing were unaffected.

Report: "Increased Verification API Error Rates"

Last update
resolved

This issue has been resolved.

monitoring

API has been operating normally since 12:25 UTC. We're continuing to monitor.

investigating

We are seeing increased error rates from the verification API

Report: "System outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We've determined that the API is also being affected by this outage and are continuing to resolve the issue.

identified

We've identified the issue causing the outage and systems have been partially restored. There may be continued delays and outages while we resolve the issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating a system outage that is affecting our website and application, including email list verification.

Report: "Network Outage"

Last update
resolved

All systems have resumed normal operation.

investigating

We are investigating a network outage.

Report: "System Outage"

Last update
resolved

Services restored.

identified

We are currently undergoing an unscheduled outage.

Report: "Temporary Outage"

Last update
resolved

Kickbox has concluded the unscheduled maintenance.

identified

Kickbox is conducting unscheduled maintenance on the database server. Service will resume momentarily.

Report: "Slow List Verification"

Last update
resolved

List verification performance has returned to normal.

monitoring

List processing and stats have resumed normal operation.

identified

We have identified the issue and are working towards a resolution. Jobs are mostly processing normally. However, stats may not be updating for some users' list verifications.

investigating

Some users are experiencing slow and/or restarting list verifications. Engineers are working to identify the root cause of the delay.

Report: "Investigating upstream network performance degradation"

Last update
resolved

All systems have resumed normal operation

identified

Issue appears to be related to an upstream provider, Cloudflare. More @ https://www.cloudflarestatus.com/

investigating

We are currently investigating this issue.

Report: "Delayed List Verification"

Last update
resolved

This incident has been resolved.

identified

We are currently experiencing a large queue of list verifications. Some jobs may take several minutes to start.

Report: "Investigating a Network Outage"

Last update
resolved

We have completed the migration to an unaffected network. All systems should now be operating normally.

monitoring

We are rerouting traffic to another network to circumvent network issues with an upstream provider

monitoring

Network operations have been restored. We're waiting on a post-mortem from the upstream network provider.

investigating

We are currently investigating this issue.

Report: "Emergency Maintenance"

Last update
resolved

Maintenance has been completed.

identified

Kickbox is conducting emergency maintenance and will be back online shortly.

Report: "Slow response times during some requests in the Management Interface"

Last update
resolved

The management interface should now be operating normally

identified

We're currently experiencing some slow response times for certain requests in the management interface. This is not affecting the API or list verification. We have identified the cause and will be resolving shortly.

Report: "Delayed or Halted List Verification"

Last update
resolved

List Verification should now be operating normally.

investigating

We are investigating an issue that is causing list verifications to become delayed or halt entirely. The API is not affected.

Report: "Emergency Maintenance"

Last update
resolved

Emergency maintenance has been completed and list verification is functioning normally.

identified

We are currently performing emergency maintenance that may impact list verification performance.

Report: "Packet Loss"

Last update
resolved

Our network provider has confirmed the issue is resolved.

monitoring

Traffic flow has returned to normal. We're continuing to monitor the situation.

investigating

We are investigating occasional packet loss with our upstream network providers that may be affecting some API users.

Report: "Intermittent Network Latency"

Last update
resolved

API and Interface have resumed normal operation

investigating

Some users are experiencing slower than usual response times from the API and user interface.

Report: "Network Outage"

Last update
resolved

Our upstream network provider issued the following post mortem: A configuration change to one of our BGP peers caused the upstream sessions to restart. The problem has been resolved and corrected so it will not occur again.

monitoring

Operations have returned to normal at this time. We are continuing to monitor the situation.

investigating

We are investigating a network outage affecting the API and Management Interface.

Report: "Increase in response times and unknown verification results"

Last update
resolved

Verification response times and unknown result rates have returned to normal

identified

We are currently experiencing greater than average response times and an increase in unknown verification results. The issue has been identified and we are working towards a resolution.

Report: "Scheduled Maintenance"

Last update
resolved

Maintenance complete! Some Recipient Authentication messages were delayed during the maintenance. Those messages will be relayed shortly.

identified

We are currently undergoing a scheduled maintenance. Web UI will be temporarily inaccessible. Verification API should remain unaffected throughout this maintenance.

Report: "Slow Verification Response"

Last update
resolved

All verifications have been operating normally since our previous update. Marking this incident as resolved.

monitoring

Verifications have returned to normal parameters. We are continuing to monitor the situation as we further our root cause analysis. We will update with further information at that time. In the meantime, we are re-enabling list verifications.

investigating

Verification API is reporting close-to-normal response times now. We are still working to identify the root issue. Verification jobs have been paused until the issue is identified.

investigating

We are investigating slow/failing verification responses

Report: "Errors with upstream provider"

Last update
resolved

The upstream provider issues have been resolved and Kickbox services have returned to normal.

identified

Amazon S3 is experiencing an outage that is affecting Kickbox's list verification and integration services. We are currently monitoring the situation. Verification jobs may experience delays in processing.

Report: "Some delays in email delivery for Recipient Authentication messages"

Last update
resolved

Deliveries have returned to normal. Any delayed messages should arrive shortly.

identified

Our upstream provider is reporting some email delivery delays, which are affecting Recipient Authentication message delivery. More info here: https://status.sparkpost.com/incidents/v53x9z1rb7jh

Report: "Outage from Upstream Provider Causing Delays"

Last update
resolved

All systems have resumed normal operation.

monitoring

All systems are resuming normal operation. We will continue to monitor the situation.

identified

An outage from an upstream compliance provider is causing delays in account creation, account management, billing, and list verification. We are posting an update to temporarily circumvent our reliance on this provider.

Report: "Intermittent issues with list upload / download"

Last update
resolved

This issue is resolved and the service is operating normally.

monitoring

List upload and download are operating normally. We will continue to monitor the situation.

identified

We are seeing increased error rates from Amazon's S3 service which is causing some delays and issues with list upload and download.

Report: "Diminished Website and API Service"

Last update
resolved

This issue has been resolved.

monitoring

An issue was identified with top level .io nameservers that resulted in DNS resolution problems for several .io domains across the internet. The issue has been resolved by the .io registrar, nic.io, but some customers may continue to experience issues reaching kickbox.io or using the Kickbox API. This problem may continue until the bad resolution expires from cache.

identified

The downstream provider is aware of the issue and is currently working towards a resolution.

investigating

We are investigating an issue with a downstream provider that may be causing problems accessing kickbox.io and the Kickbox API.