ChargeOver

Is ChargeOver Down Right Now? Check whether there is an ongoing outage.

ChargeOver is currently Operational

Last checked from ChargeOver's official status page

Historical record of incidents for ChargeOver

Report: "ChargeOver inaccessible"

Last update
identified

We are working with our primary data center to resolve the issue. At this time, we believe this to be a carrier networking issue at the data center itself, outside of ChargeOver's infrastructure.

Report: "System inaccessible"

Last update
investigating

We are currently investigating this issue.

Report: "Some emails are stuck in a SENDING... state"

Last update
resolved

We identified an issue where some emails sent the evening of March 10 / the morning of March 11 were stuck in a SENDING... state instead of being delivered. The cause was an internal network-attached storage device becoming disconnected; that issue has been resolved and the affected emails are being delivered. We are continuing to investigate the root cause further.

Report: "Delayed syncs to QuickBooks Online, Xero, Salesforce"

Last update
resolved

This issue has been resolved. All data syncs to Xero, QuickBooks Online, and Salesforce have caught up and fully synced.

monitoring

Data syncs to QuickBooks Online, Xero, and Salesforce are catching up quickly, and are almost fully caught up. Data will still be synced to QuickBooks Online, Xero, and Salesforce automatically, but the sync to those systems may be delayed by a few more hours. Manual syncing of data will not be required. We anticipate syncing being fully caught up within the next ~8 hours.

monitoring

Data syncs to QuickBooks Online, Xero, and Salesforce are catching up quickly, but there is a significant number of items to sync before everything has fully caught up. Data will still be synced to QuickBooks Online, Xero, and Salesforce automatically, but the sync to those systems may be delayed by several hours. Manual syncing of data will not be required. We are continuing to monitor the situation.

identified

We have identified an issue causing delayed syncs of data to QuickBooks Online, Xero, and Salesforce for a small subset of accounts. Data will still be synced to QuickBooks Online, Xero, and Salesforce automatically, but the sync to those systems may be delayed by several hours. Manual syncing of data will not be required. We are working on a resolution.

Report: "Delayed delivery of some emails"

Last update
postmortem

## Incident details

From July 25th 14:00 PST to July 29 20:41 PST, some emails sent via ChargeOver were delayed. Only emails sent via ChargeOver’s email provider were affected (emails sent via ChargeOver’s custom SMTP, custom SendGrid, Mailgun, and Mandrill integrations were _not_ impacted).

## Root cause

On two separate days in July, malicious users logged into two separate ChargeOver accounts, and used them to send a large number of spam/scam emails. The compromised accounts belonged to _customers of ChargeOver_, not to ChargeOver itself. Both ChargeOver accounts used extremely easy-to-guess passwords, used those passwords across many other applications beyond ChargeOver, and did not have 2FA/MFA enabled within ChargeOver. Malicious actors were able to guess the ChargeOver users' passwords to log in to ChargeOver. _ChargeOver itself was not hacked and did not suffer any sort of security breach._

This led to ChargeOver’s primary email provider (SendGrid) temporarily placing a hold on some outgoing email from ChargeOver. ChargeOver worked closely with SendGrid to restore normal email delivery.

## Incident timeline

* Early July - two malicious actors guess ChargeOver user passwords (or re-use passwords from other applications), and send many malicious emails via ChargeOver
* July 12 - 06:22 CT - ChargeOver’s monitoring automatically alerts our team of unexpected activity on two ChargeOver accounts, and we notify the affected customers and force password changes on the affected accounts
* July 13 - 08:25 CT - ChargeOver begins work towards future mitigation strategies
* July 13 - July 29 - ChargeOver pushed out 19 separate updates through this 2-week period to mitigate impact, protect against future attacks, and migrate some critical pieces of infrastructure away from the affected SendGrid account
* July 25 - 16:00 CT - SendGrid places a suspension/hold on one of ChargeOver’s email accounts, but does not notify us
* July 26 - 11:55 CT - ChargeOver staff reach out to SendGrid because we are seeing some email being delayed
* July 29 - 22:41 CT - SendGrid removes the hold/suspension on the account, and normal email delivery returns

## Remediation plan

We’ve identified a number of items to be addressed to protect against future attacks and mitigate impact.

**The most important item is to encourage all ChargeOver customers to [enable 2FA/MFA on their ChargeOver account](https://help.chargeover.com/docs/getting-started/configuration/two-factor-authentication-2fa/).** Enabling 2FA/MFA is quick, easy, free, and the single most important thing anyone can do to protect against any sort of unauthorized access to any application you use online. ChargeOver customers will see a push for 2FA/MFA adoption across all ChargeOver accounts.

Items already accomplished during the July 13 - July 29 period:

* Improved / more proactive monitoring of spam reports
* Added rate limiting for sending email through the ChargeOver admin panel
* Moved critical email infrastructure to separate, independent SendGrid accounts

Future plans:

* Enforce rate limiting for email sending through the REST API
* Enforce rate limiting for email sending through the ChargeOver.js API
* Add various rate limits into a number of in-app endpoints
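The remediation plan above mentions rate limiting for email sending (already added in the admin panel, planned for the REST and ChargeOver.js APIs). ChargeOver has not published how its limiter works, so the following is only a minimal sketch of one common approach, a per-account sliding-window limiter; the class name, limits, and account ID below are hypothetical.

```python
import time
from collections import defaultdict, deque

# Hypothetical per-account sliding-window limiter, similar in spirit to the
# rate limiting described above (ChargeOver's actual implementation is not public).
class EmailRateLimiter:
    def __init__(self, max_sends: int = 100, window_seconds: int = 3600):
        self.max_sends = max_sends
        self.window = window_seconds
        self._sends = defaultdict(deque)  # account_id -> timestamps of recent sends

    def allow(self, account_id: str) -> bool:
        """Return True if this account may send another email right now."""
        now = time.monotonic()
        q = self._sends[account_id]
        # Drop timestamps that have fallen outside the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_sends:
            return False  # over the limit; caller should queue or reject the send
        q.append(now)
        return True

limiter = EmailRateLimiter(max_sends=100, window_seconds=3600)
if limiter.allow("account-123"):
    print("send the email")
else:
    print("rate limited; defer the send")
```

A production limiter would typically keep its counters in a shared store (for example Redis) so every application server enforces the same per-account limit.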

resolved

Email delivery is operating normally. A postmortem will follow.

monitoring

The email delays have been resolved. We are monitoring delivery of emails. Some users may still experience some delays and/or higher-than-normal bounce rates for a short amount of time. Further updates and a postmortem will follow.

identified

We have diverted all email delivery to a secondary email account. Some delivery delays are still occurring. While we are working to resolve this, some emails sent may not be sent with correctly aligned DKIM policies/signatures. We continue to work with our email provider to resolve this.

identified

We are diverting some email delivery to a secondary email account. Some delivery delays are still occurring. While we are working to resolve this, some emails sent may not be sent with correctly aligned DKIM policies/signatures.

identified

Emails of the following types are being delivered without any delays:

* invites sent to new admin users
* admin password resets
* scheduled reports
* any sort of admin notifications (e.g. notifications when a quote is accepted, when custom domains are configured, etc.)

We are aware that many of our customers are seeing significant delivery delays for transactional emails (e.g. invoice due emails, payment receipt emails, etc.). We are working with our email provider to get the delays resolved as soon as possible. If your account is using your own SMTP server or your own SendGrid, Mailgun, or Mandrill account, your account is unaffected by the delivery delays.

identified

We have identified the problem, and are working with our email provider to resolve it. At this time, we expect all emails to be sent successfully, but some delivery delays may occur.

identified

We have identified the problem, and are working with our email provider to resolve it. At this time, we expect all emails to be sent successfully, but some delivery delays may occur.

investigating

Some customers are experiencing a delay in delivery of outgoing emails from ChargeOver. Our team is working with our email provider to investigate the delay. At this time, we expect all emails to be sent successfully, but some delivery delays may occur.

Report: "Some customers are having trouble accessing ChargeOver"

Last update
postmortem

## Incident details

On November 12th, some customers had trouble accessing ChargeOver. This was a networking issue at the primary data center ChargeOver is hosted within.

## Root cause

On 12 November at 10:26 CT, ChargeOver began receiving automated reports of connectivity issues. The issue was tracked down to a networking issue at the data center ChargeOver’s primary location is within (Flexential in Chaska, MN). The problem was immediately escalated to Flexential engineers, who identified an internet circuit at Flexential’s Chicago POP as the source of the issue.

Flexential opened a ticket with the circuit provider, who confirmed there was an issue with packets over a certain size traversing the device due to an MTU misconfiguration. The circuit provider confirmed that they fixed the issue; however, Flexential kept production traffic off the link while further investigation was performed. Total impact to service was around 41 minutes.

## Incident timeline

* Nov 12 - 10:26am CT - ChargeOver’s automated uptime monitoring identifies the issue and alerts our engineering team
* Nov 12 - 10:53am CT - ChargeOver identifies the issue as a bandwidth/networking issue outside of the ChargeOver network, and escalates the issue to Flexential engineering
* Nov 12 - 11:05am CT - Flexential resolves the issue

## Remediation plan

ChargeOver is working with our data center provider Flexential to improve monitoring and connectivity.
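The root cause above is an MTU misconfiguration dropping packets over a certain size. Purely as an illustration of how such a problem can be spotted from the outside (not ChargeOver's or Flexential's actual tooling), the hypothetical probe below sends pings of increasing payload size with fragmentation disallowed (Linux `ping -M do`); the target host and sizes are placeholders.

```python
import subprocess

def probe_path_mtu(host: str, sizes=(1200, 1400, 1472, 1500)) -> None:
    """Probe which ICMP payload sizes reach `host` with fragmentation disallowed.

    Uses the Linux `ping -M do` (don't-fragment) flag; payload size + 28 bytes
    of IP/ICMP headers approximates the packet size on the wire.
    """
    for size in sizes:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(size), host],
            capture_output=True,
            text=True,
        )
        status = "ok" if result.returncode == 0 else "dropped or too big"
        print(f"payload {size} bytes -> {status}")

# Example (hypothetical host): large packets failing while small ones succeed
# is consistent with an MTU misconfiguration along the path.
probe_path_mtu("example.com")
```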

resolved

All connectivity issues are resolved now. Our primary data center (Flexential - Chaska) experienced a networking issue which impacted the ability to connect to ChargeOver services for some customers.

monitoring

All ChargeOver services are operating normally and are accessible to everyone now. A networking/connectivity incident occurred with our data center provider (Flexential) which prevented some customers from accessing ChargeOver. We believe any impact is now resolved, and we continue to monitor connectivity. We are still waiting for full details from the provider, and will post a further update with more information once we have it.

investigating

All ChargeOver services are operating normally, but some customers are having trouble accessing ChargeOver. At this time, we believe this is due to a carrier network routing issue outside of ChargeOver. We are still investigating and working with our data center providers to get more information.

investigating

We are receiving reports that some customers are having trouble accessing the ChargeOver app. We are investigating further. All ChargeOver services appear to be operating normally, but it looks like some web traffic may be having problems accessing ChargeOver services.

Report: "Investigating issues with ChargeOver"

Last update
postmortem

## Incident details

On November 18th, ChargeOver suffered a partial service outage. Many customers were unable to access most parts of the ChargeOver application.

## Root cause

ChargeOver has a cluster of redundant ingress servers which accept all application traffic, and then route that traffic internally to different backend services. This cluster of ingress servers (HAProxy servers) is backed by a distributed network file store (GlusterFS), which stores data (encrypted SSL/TLS certificates) needed by HAProxy to serve secure connections for incoming HTTP requests.

Sometime between November 7th and November 17th, the connection between two of our ingress load balancers and our distributed network file store was disconnected. We are still investigating the cause of the disconnection.

On November 18th, we deployed a routine update to our ingress load balancers. The deployment failed on 2 of the 3 redundant ingress load balancers due to the lost connection to the distributed network file store. This caused approximately 2/3 of traffic to ChargeOver to be dropped. The third ingress load balancer stayed active and continued to serve traffic.

Automatic DNS fail-over occurred, automatically removing the 2 affected ingress load balancers from serving traffic within 5 minutes of the outage. ChargeOver staff were notified immediately. We restarted the two affected ingress load balancers to re-establish the distributed network file store connection, and re-deployed the ingress load balancer update. This restored access to all services.

## Incident timeline

* Sometime between November 7th and November 17th - connection between load balancers and distributed network file store is lost
* November 18 - 3:29pm CT - Our team starts a routine deployment to our ingress load balancers
* November 18 - 3:33pm CT - The deployment completes, and we are immediately notified of a problem with 2 of the 3 load balancers
* November 18 - 3:38pm CT - Automatic DNS fail-over removes 2 of our 3 ingress load balancers from DNS, as expected
* November 18 - 3:44pm CT - We post an initial update to [https://status.chargeover.com/](https://status.chargeover.com/) about the partial outage
* November 18 - 3:55pm CT - After assessing the situation, we reboot the two problematic load balancers
* November 18 - 3:59pm CT - Re-deploy of the ingress update for load balancers succeeds across all 3 servers
* November 18 - 4:02pm CT - Our team confirms everything looks good, and DNS fail-over automatically adds the 2 failed servers back to DNS
* November 18 - 4:07pm CT - Update posted to [https://status.chargeover.com](https://status.chargeover.com) indicating the issue has been resolved

## Remediation plan

We’ve identified a number of take-aways from this incident.

* We are implementing additional monitoring to proactively monitor the connection between the ingress load balancers and the distributed network file store.
* We are discussing possible solutions to reduce/remove the dependency on the distributed file store from the load balancers.
* There were several internal DNS names which pointed at specific ingress load balancers, rather than using the round-robin, more fault-tolerant DNS which our public services use. This hampered our ability to recover from this situation more quickly, so we are moving those internal DNS names to use round-robin DNS.
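One remediation item above is additional monitoring of the connection between the ingress load balancers and the distributed network file store. As a rough illustration only (the mount point and timeout are hypothetical, and this is not ChargeOver's actual monitoring), a check like the following verifies that a GlusterFS mount is both still listed and still responding:

```python
import subprocess
import sys

# Hypothetical mount point where HAProxy reads its TLS material from GlusterFS.
GLUSTER_MOUNT = "/mnt/haproxy-certs"

def mount_is_listed(path: str) -> bool:
    """Check /proc/mounts to confirm the path is still mounted."""
    with open("/proc/mounts") as f:
        return any(line.split()[1] == path for line in f)

def mount_responds(path: str, timeout_seconds: int = 5) -> bool:
    """Stat the mount in a subprocess so a hung network filesystem cannot block us."""
    try:
        subprocess.run(["stat", "-t", path], capture_output=True,
                       timeout=timeout_seconds, check=True)
        return True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return False

if __name__ == "__main__":
    healthy = mount_is_listed(GLUSTER_MOUNT) and mount_responds(GLUSTER_MOUNT)
    print("OK" if healthy else "CRITICAL: distributed file store unreachable")
    sys.exit(0 if healthy else 2)
```

Running the `stat` in a subprocess with a timeout matters because a hung network filesystem can otherwise block the monitoring process indefinitely.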

resolved

This incident has been resolved. A post-mortem containing further details to come.

identified

We've identified the problem and are implementing a fix.

investigating

We are aware of ongoing issues with access to our platform. We are currently investigating and will release details here as they become available.

Report: "Delayed processing of search / sync to 3rd-party platforms / email sending"

Last update
resolved

This issue has been resolved. A postmortem will follow.

monitoring

We are continuing to monitor to ensure that all 3rd-party integration syncs catch up.

monitoring

3rd-party integrations sync is continuing to catch up, and now impacting only a small percentage of users. We are continuing to monitor the sync.

identified

We have identified the cause of the degraded performance, and are working towards a resolution. Search and email sending are performing normally now. Sync to 3rd-party integrations is catching up, but still delayed.

investigating

ChargeOver is experiencing delays when syncing data to 3rd-party platforms (Salesforce, QuickBooks, Xero, etc.) and some delays with search and email sending. We are investigating the delays.

Report: "ChargeOver main application is unavailable"

Last update
postmortem

## Incident details

The ChargeOver team follows an agile software development lifecycle for rolling out new features and updates. We routinely roll out new features and updates to the platform multiple times per week. Our typical continuous integration/continuous deployment roll-outs look something like this:

* Developers make changes
* Code review by other developers
* Automated security and linting testing is performed
* Senior-level developer code review before deploying features to production
* Automated deploy of new updates to production environment
* If an error occurs, roll back the changes to a previously known-good configuration

ChargeOver uses `Docker` containers and a redundant set of `Docker Swarm` nodes for production deployments. On `July 19th at 10:24am CT` we reviewed and deployed an update which, although it passed all automated acceptance tests, caused the deployment of `Docker` container images to silently fail. This caused the application to become unavailable, and a `503 Service Unavailable` error message was shown to all users. The deployment appeared to be successful to automated systems, but due to a syntax error actually only removed existing application servers rather than replacing them with the new software version. No automated roll back occurred, because the deployment appeared successful but had actually failed silently.

## Root cause

A single extra space (a single errant spacebar press!) was accidentally added to the very beginning of a `docker-compose` `YAML` file, which caused the `docker-compose` file to be invalid `YAML` syntax. The single-space change was subtle enough to be missed when reviewing the code change. All automated tests passed, because the automated tests do not use the production `docker-compose` deployment configuration file. When deploying the service to `Docker Swarm`, `Docker Swarm` interpreted the invalid syntax in the `YAML` file as an empty set of services to deploy, rather than a set of valid application services to be deployed. This caused the deployment to look successful (it successfully deployed, removing all existing application servers and replacing them with nothing), and thus automated roll-back to a known-good set of services did not happen.

## Incident timeline

* 10:21am CT - Change was reviewed and merged from a staging branch to our production branch.
* 10:24am CT - Change was deployed to production, immediately causing an outage.
* 10:29am CT - Our team posted a status update here, notifying affected customers.
* 10:36am CT - Our team identified the related errant change, and started to revert to a known-good set of services.
* 11:06am CT - All services became operational again after deploying the last known good configuration. At this time, all services were restored and operational.
* 11:09am CT - Our team identified exactly what was wrong - an accidentally added single space character at the beginning of a configuration file, causing the file to be invalid `YAML` syntax.
* 11:56am CT - Our team made a change to validate the syntax of the `YAML` configuration files, to ensure this failure scenario cannot happen again.

## Remediation plan

There are several things that our team has identified as part of a remediation plan:

* We have already deployed multiple checks to ensure that invalid `YAML` syntax and/or configuration errors cannot pass automated tests, and thus cannot reach testing/UAT or production environments.
* Our team will work to improve the very generic `503 Service Unavailable` message that customers received, directing affected customers to [https://status.chargeover.com](https://status.chargeover.com), where they can see real-time updates regarding any system outages.
* Customers logging in via [https://app.chargeover.com](https://app.chargeover.com) received generic `The credentials that you provided are not correct.` messages, instead of a notification of the outage. This will be improved.
* Our team will do a review of our deployment pipelines, to see if we can identify any other similar potential failure points.
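The first remediation item above is to make sure invalid `YAML` in the deployment configuration can never pass automated tests. As a minimal sketch of that kind of pre-deploy check (the file names and the "must define services" rule are assumptions, not ChargeOver's actual pipeline), a PyYAML-based validator might look like the following; running `docker compose config -q` against the real file is another common way to catch the same class of error.

```python
import sys
import yaml  # PyYAML

def validate_compose_file(path: str) -> bool:
    """Fail the pipeline if the compose file is invalid YAML or defines no services."""
    try:
        with open(path) as f:
            doc = yaml.safe_load(f)
    except yaml.YAMLError as err:
        print(f"{path}: invalid YAML: {err}")
        return False
    # A stray leading space can break indentation for the whole document, so
    # insist on a mapping with a non-empty `services` section.
    if not isinstance(doc, dict) or not doc.get("services"):
        print(f"{path}: parsed, but no services are defined")
        return False
    print(f"{path}: OK ({len(doc['services'])} services)")
    return True

if __name__ == "__main__":
    ok = all(validate_compose_file(p) for p in sys.argv[1:])
    sys.exit(0 if ok else 1)
```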

resolved

The incident has been resolved. A postmortem will be provided.

monitoring

All systems have been restored. We are monitoring the fix. A postmortem will be posted.

identified

We are continuing to work on a fix for this issue.

identified

We have identified the problem. ETA to resolution is less than 30 minutes.

investigating

We are aware of the problem, and are investigating. We will post further updates as we have them.

Report: "In-app pages are not loading correctly; trouble logging in"

Last update
resolved

This issue has been resolved. Please contact us if you continue to have any trouble.

monitoring

A fix has been implemented, and we are monitoring to make sure all services have recovered.

identified

We have identified an issue causing trouble logging in, and causing some pages to load incorrectly. ETA to fix is less than an hour.

Report: "Some users are unable to log in from https://app.chargeover.com"

Last update
resolved

The login issue has been resolved.

identified

We have identified the login issue, and are working on a fix. You can continue to log in at https://(your-company-name).chargeover.com/admin/ while we work towards a fix. This issue does NOT affect any scheduled processes. Invoices were created, and payments were run as scheduled, without problem.

investigating

Some users are unable to log in at https://app.chargeover.com/. If you are having trouble, you can log in at your account-specific URL immediately instead. For example, log in at https://(your-company-name-here).chargeover.com/admin/. We are investigating, and will provide further updates soon.

Report: "DDoS"

Last update
resolved

ChargeOver experienced a DDoS (distributed denial of service) attack from a malicious actor using a large number of IP addresses and Digital Ocean ( https://www.digitalocean.com/ ) servers. An extremely large number of HTTP requests (hundreds of thousands of requests in just a few minutes) were directed at our servers in a very short amount of time, and this degraded the performance of ChargeOver's systems enough to make ChargeOver unavailable or slow for some users, for approximately 30 minutes. ChargeOver staff were immediately alerted of the attack, and responded by deploying firewall rules to block the attack. No intrusions were made, no vulnerabilities were exploited, no data loss occurred, and scheduled payment processing was unaffected. ChargeOver is already in the process of deploying additional DDoS mitigation tools into our network to mitigate against any future DDoS attacks.
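The actual response described above was firewall rules plus dedicated DDoS mitigation tooling. Purely as an illustration of the "find the noisy IPs" step that usually precedes such rules (the log path, field layout, and threshold below are all assumptions, not ChargeOver's setup), a log-scanning sketch might look like:

```python
from collections import Counter

LOG_PATH = "/var/log/haproxy/access.log"   # assumed location
THRESHOLD = 5000                           # requests in the sampled window

def block_candidates(path: str, threshold: int) -> list[str]:
    """Tally client IPs in an access log and return those above the threshold."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            parts = line.split()
            if parts:
                counts[parts[0]] += 1      # assumes the client IP is the first field
    return [ip for ip, n in counts.most_common() if n >= threshold]

if __name__ == "__main__":
    for ip in block_candidates(LOG_PATH, THRESHOLD):
        # Emit firewall commands for review rather than applying them blindly.
        print(f"iptables -A INPUT -s {ip} -j DROP")
```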

Report: "Processing Delays"

Last update
resolved

At this time all scheduled payments have been processed, emails sent, and the platform is operating normally. We are investigating root cause. If you continue to experience any issues, please contact our support team via email at support@ChargeOver.com

monitoring

A fix has been implemented and we are monitoring the results.

identified

ChargeOver is experiencing a delay in processing which has affected our automated payments/emails and in-app searching. We have identified the problem and are working to resolve the matter. Thank you for your patience.

Report: "Delay in processing payments, syncing data to external services (QuickBooks, Salesforce, etc.)"

Last update
postmortem

## Root cause

* A hardware failure in ChargeOver’s storage system caused severely degraded performance.
* Automatic failover to redundant systems did not occur, which resulted in ChargeOver being completely unavailable for approximately 2 hours.

## Timeline

* 06:10 CT on Jan 1 2022 - ChargeOver staff are alerted to abnormally slow invoice generation / payment processing times, and start investigating immediately.
* 07:27 CT on Jan 1 2022 - Data storage for company logos is identified as causing performance impacts.
* 08:16 CT on Jan 1 2022 - A hardware component of ChargeOver’s data storage system fails. Despite hardware redundancy being in place, the failure impacts other components and stops invoice generation / payment processing. Other components of ChargeOver remain operational.
* 11:30 CT on Jan 1 2022 - Further hardware failures occur while trying to recover from the initial failure. Fail-over to the in-place redundant systems does not occur automatically, which causes ChargeOver to become entirely unavailable.
* 13:15 CT on Jan 1 2022 - Access to most ChargeOver systems is restored.
* 13:30 CT on Jan 1 2022 - ChargeOver staff immediately begin work on ensuring that invoice generation delays / payment processing delays are remediated.
* 00:15 CT on Jan 2 2022 - ChargeOver staff notify affected customers of the need to check / update logos on their invoices.
* Continued work - ChargeOver staff continue to check to ensure any data needing sync to third-party systems does get synced.

## Impact

The following systems were known to be impacted by this failure:

* **Some company logos may be unavailable - all ChargeOver customers should check to ensure their company logo is showing up correctly on their invoices / payment receipts.**
* **ChargeOver, and all related components (website, app, REST API, sync to external systems, customer portal, etc.), was unavailable from approximately 11:30 CT to 13:30 CT on Jan 1 2022.**
* Approximately 40% of ChargeOver customers experienced a delay in generating invoices.
* Approximately 40% of ChargeOver customers experienced a delay in processing payments.
* Syncing of data to third-party integrations (QuickBooks, Xero, Salesforce, etc.) was delayed (and in some limited cases, the sync to these systems may still be delayed as of Jan 3).
* Some emails (invoice due notices, payment receipts) may have failed to send.
* Some webhooks may have failed to send.
* Dunning / retry processes may not have run on Jan 1 and Jan 2 for a small subset of ChargeOver customers.

## Future planning

ChargeOver recognizes the impact to our customers that this incident has caused. At this time we are assessing what needs to change to make the storage systems that failed more tolerant of failures, so that this sort of incident cannot be repeated.

resolved

We continue to monitor and will provide a postmortem. All systems are operational.

monitoring

All ChargeOver systems are operational at this time. Invoices for Jan 1, 2022 have been generated, and payments for Jan 1, 2022 have been processed. A small percentage of payments for a small subset of ChargeOver customers have been deferred for processing until tomorrow (Jan 2, 2022). These deferred payments will process automatically. Some data may not have synced to third-party integrations (e.g. QuickBooks, Salesforce, Xero) yet. We are working on ensuring that everything syncs as quickly as possible. ChargeOver customers should watch for an email regarding company logos, as you may be required to verify the company logo you uploaded into ChargeOver is up-to-date. We are continuing to monitor all systems and will continue to post updates, and a postmortem.

identified

Any invoices that were due to generate have been generated. Delayed payments continue to process and catch up. Thanks for your patience.

identified

Invoices that needed to be generated today and were delayed are generating now, and should be caught up shortly. Payments that were delayed should begin running shortly. Further updates will follow.

identified

We are continuing to work on a fix for this issue.

identified

Some services are restored. We are still working on restoring all functionality/services. We will provide further updates soon.

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on resolving this issue. ChargeOver may be unavailable for some customers.

identified

ChargeOver is experiencing a delay in processing payments, and syncing data to some external services, for some ChargeOver accounts. Our team is investigating the delays and will provide further updates soon.

Report: "Delays in syncing data to QuickBooks, Salesforce, Xero, etc."

Last update
resolved

This has been resolved.

monitoring

All syncing delays should be resolved now. We are monitoring to make sure all data has synced.

identified

ChargeOver is experiencing delays syncing some data to QuickBooks / Xero / Salesforce / some other integrations. Data WILL sync, but will be delayed. We are working on a resolution.

Report: "Service Outage"

Last update
postmortem

ChargeOver primarily uses the MariaDB database for data storage. The MariaDB process crashed on our primary database server at 11:09 CST. The process crashed with the following error message, which is still being investigated:

`InnoDB: Failing assertion: templ->clust_rec_field_no != ULINT_UNDEFINED`

ChargeOver staff were immediately notified, and we began to investigate the issue. The database was automatically restarted immediately, and the database began to run automated data integrity checks to ensure that no data was lost and no data was corrupted before beginning to service requests again. Although we had the ability to fail over to a secondary database server, a decision was made to let the process complete, as the estimated downtime was very short.

This automated data integrity check took much longer than originally expected. Our estimated time to recovery was less than an hour; instead, the database server took approximately 3 hours to do integrity checks and restart safely. The data integrity checks took from 11:09 CST to 14:49 CST. After the data checks were complete, our team ran through our recovery checklist and started the database server. Service was restored in a degraded state at 15:01 CST, and fully operational at 15:15 CST.

We recognize that there are many things to be learned from this incident, and we are working towards putting pieces in place to avoid the long check times and to revise our fail-over processes to better account for possible long check times in the future.

Please make sure to subscribe to updates at [https://status.ChargeOver.com](https://status.ChargeOver.com) to be notified of service disruptions in the future.
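One take-away above is that the fail-over decision should account for recovery checks running far longer than estimated. Purely as an illustration (the host, port, and budget below are hypothetical, and this is not ChargeOver's actual tooling), a watchdog that flags when primary-database downtime exceeds a fail-over budget could look like:

```python
import socket
import time

# Hypothetical watchdog: if the primary database stays unreachable longer than the
# fail-over budget, alert a human to initiate fail-over rather than waiting out
# a long crash-recovery run.
PRIMARY = ("db-primary.internal", 3306)   # assumed host/port
FAILOVER_BUDGET_SECONDS = 15 * 60

def db_reachable(addr, timeout=3) -> bool:
    """True if a TCP connection to the database port can be opened."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def watch():
    down_since = None
    while True:
        if db_reachable(PRIMARY):
            down_since = None
        else:
            down_since = down_since or time.monotonic()
            outage = time.monotonic() - down_since
            if outage > FAILOVER_BUDGET_SECONDS:
                print(f"ALERT: primary down {outage:.0f}s, exceeds fail-over budget")
        time.sleep(30)

if __name__ == "__main__":
    watch()
```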

resolved

This issue has been resolved and all services are operational. Root cause and postmortem will follow.

monitoring

All ChargeOver services are operational now. We continue to monitor the situation. A postmortem will follow.

monitoring

We continue to monitor the situation as services are being restored.

identified

Services are being restored. We are continuing to monitor the situation and will provide further updates.

identified

We are continuing to work towards restoration of service. More updates to follow.

identified

We have identified the issue, and are working to resolve the outage as quickly as possible.

investigating

We are aware of an issue, and are investigating. Further information to follow.

Report: "First Data / Payeezy is experiencing a widespread network outage"

Last update
resolved

First Data has informed us that the problem is resolved. First Data has not provided a root cause or any other information about the issue at this time.

monitoring

The ChargeOver team is continuing to monitor this situation. First Data has not yet provided an ETA or status update. They are aware of the issue and are working to resolve it. As we get more information from First Data, we will provide further updates.

monitoring

We continue to monitor this situation. First Data has not responded to our team with an ETA or status update at this time, but we are seeing much better connectivity and much improved processing with First Data now. As we get more information from First Data, we will provide further updates.

monitoring

First Data (Payeezy, BluePay, and a backend to many other payment processors) is currently experiencing a network outage. Merchants using Payeezy as a payment processor may see sporadic gateway timeouts / connection errors / other errors until First Data resolves this issue. First Data was unable to give us an ETA at this time, but as we get more information from them, we will provide updates.

Report: "Invoice generation, sending, and payment processing is delayed for some accounts"

Last update
postmortem

**What Happened / Customer Impact**

Our certificate authority issued ChargeOver a TLS/SSL certificate with a CA chain certificate which expired before our actual TLS/SSL certificate did. This caused automatic invoice generation and payment processing to be delayed for some customers.

**Technical Details**

ChargeOver secures internal and external connections with TLS/SSL certificates. The company ChargeOver purchases certificates from issued us an SSL/TLS certificate with a chain file which expired before our actual certificate. So even though our SSL/TLS certificate was valid and _not_ expired, a certificate in the chain needed to validate the certificate expired on May 30th.

Most web browsers were unaffected, so access to the [ChargeOver.com](http://ChargeOver.com) website and app was unaffected. However, common libraries and tools like cURL and wget began rejecting connections due to the expired chain certificate. ChargeOver's scheduled invoice generation and scheduled payment processes depend on the cURL library, and so the system was unable to trigger invoice generation and payment processing for some accounts scheduled to generate invoices after the certificate had expired.

Our monitoring picked up the issue and alerted our engineering team. ChargeOver then removed the expired certificate from the chain, and invoices and payments began processing normally again. This caused a delay of invoice/payment processing of between 30 minutes and 4 hours for some ChargeOver accounts.

**Ongoing Efforts**

We are working on additional validation to ensure that our CA cannot issue us certificates with a certificate chain that expires prior to the certificate we are being issued.
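The ongoing effort above is to validate that no certificate in an issued chain expires before the leaf certificate itself. As a minimal sketch of that check using the Python `cryptography` package (the bundle file name and the leaf-first ordering are assumptions, not ChargeOver's actual process), something like this could run whenever a new certificate bundle is installed:

```python
import sys
from cryptography import x509  # pip install cryptography

def load_pem_certs(path: str):
    """Load every PEM certificate in a bundle file (assumed order: leaf first, then chain)."""
    certs, current, in_block = [], [], False
    for line in open(path):
        if "BEGIN CERTIFICATE" in line:
            in_block, current = True, []
        if in_block:
            current.append(line)
        if in_block and "END CERTIFICATE" in line:
            certs.append(x509.load_pem_x509_certificate("".join(current).encode()))
            in_block = False
    return certs

def chain_expires_before_leaf(path: str) -> bool:
    """Report any chain certificate whose expiry falls before the leaf's expiry."""
    certs = load_pem_certs(path)
    leaf, chain = certs[0], certs[1:]
    bad = [c for c in chain if c.not_valid_after < leaf.not_valid_after]
    for c in bad:
        print(f"chain cert expires {c.not_valid_after} (before leaf {leaf.not_valid_after}): "
              f"{c.subject.rfc4514_string()}")
    return bool(bad)

if __name__ == "__main__":
    sys.exit(1 if chain_expires_before_leaf(sys.argv[1]) else 0)
```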

resolved

This issue has been resolved. Postmortem to follow.

monitoring

We continue to monitor as things catch up -- invoices are generating, emails are sending, and payments are processing.

monitoring

We have resolved the issue, and services are being restored. Invoices are generating, emails are sending, and payments are processing as normal now. Some ChargeOver accounts may see delayed generation of invoices while things catch back up. A further update will follow.

investigating

Some ChargeOver accounts are experiencing delayed invoice generation, invoice sending, and automated payment processing. We are investigating.

Report: "Partial outage due to internal routing failure"

Last update
resolved

Some ChargeOver accounts were unavailable for less than 2 minutes due to an internal routing issue. The routing issue was related to an infrastructure-related upgrade deployment. Engineering identified the issue immediately after the deploy, and the issue was resolved in under 2 minutes. Engineering has taken steps to ensure that this class of errors can no longer occur on future routing-system upgrades.

Report: "Performance degraded / partial outage"

Last update
resolved

This incident has been resolved, and all services are functioning as expected. We are further investigating the root cause of this, but are not seeing any other issues at this time.

monitoring

The system has stabilized and we are not seeing any additional connection issues at this time. We are still seeing some slow page loads and delayed payment processing times, and are investigating further.

investigating

We are investigating issues connecting to ChargeOver.

Report: "Trouble logging in from www.ChargeOver.com"

Last update
resolved

The issues with the login page have been resolved.

identified

We have identified a problem with logging in from app.ChargeOver.com, and are working on a fix. ETA is approximately 15 minutes. If you log in from https://(your ChargeOver domain here).ChargeOver.com/admin/ you will be able to log in without problems. Please contact support at support@ChargeOver.com if you need assistance. Only logins from https://app.ChargeOver.com/login are problematic.

Report: "UPS failure / power outage"

Last update
resolved

Our primary data center (511 Building in Minneapolis, MN) experienced a full-building power outage. Although the building has backup generators and full-building UPS power, the UPS failed to take over when the power to the building was cut. ChargeOver staff were alerted immediately, as was the facility staff. The 511 Building is now taking bids to install a parallel (backup) UPS to prevent this kind of issue in the future. All ChargeOver services are back online, and we have confirmed that no data was lost. Any scheduled tasks or events have been processed. Total downtime was approximately 2 hours.

Report: "Degraded performance"

Last update
resolved

ChargeOver accounts experienced approximately 1 to 2 minutes of degraded performance due to an internal reporting job that generated an unusually large amount of SQL server traffic. A small number of HTTP request timeouts were seen as our load balancers/proxy servers re-routed traffic. Our engineering team is making adjustments to the reporting components to ensure this issue does not resurface.

Report: "Upstream carrier networking issue"

Last update
resolved

One of our datacenter providers confirmed the routing issues are resolved.

identified

The issue has been identified and a fix is being implemented. Our upstream carrier is estimating a 10 minute repair time.

Report: "Database Connectivity Issue"

Last update
resolved

This issue has been resolved. We will provide further updates soon.

identified

Our team has identified some database connectivity issues between hosts in our primary datacenter. This is causing outages to some ChargeOver instances. We will update with more information here.

Report: "Uptime reporting change"

Last update
resolved

We changed a response header on our monitoring page, which affected the way our 3rd-party monitoring provider checks service availability. There was no actual outage.