Zaptec

Is Zaptec Down Right Now? Discover if there is an ongoing service outage.

Zaptec is currently Operational

Last checked from Zaptec's official status page

Historical record of incidents for Zaptec

Report: "WebAPI / Zaptec Portal"

Last update
monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently experiencing some issues with Zaptec Portal/WebAPI. This is being investigated.

Report: "Zaptec Portal unresponsive"

Last update
monitoring

A fix has been implemented and we are monitoring the results.

investigating

Parts of the Zaptec Portal are currently not functioning as expected.

Report: "We are experiencing latency in our API."

Last update
investigating

We are currently investigating this issue.

Report: "OCPP Disconnections"

Last update
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Investigating

We are currently seeing drops in OCPP connections. We are investigating this issue.

Report: "OCPP Disconnections"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently seeing drops in OCPP connections. We are investigating this issue.

Report: "Charging station network disconnections"

Last update
Resolved

Connections have been stable for a couple of hours now. This incident has been resolved.

Monitoring

Connections have stabilized at a normal level. We will keep monitoring the situation.

Update

Connections are already ramping up. We are still investigating what might have caused it.

Investigating

We are currently seeing a drop in connections for our charging stations. This seems to be an external issue, but we are investigating the issue.

Report: "Charging station network disconnections"

Last update
resolved

Connections have been stable for a couple of hours now. This incident has been resolved.

monitoring

Connections have stabilized at a normal level. We will keep monitoring the situation.

investigating

Connections are already ramping up. We are still investigating what might have caused it.

investigating

We are currently seeing a drop in connections for our charging stations. This seems to be an external issue, but we are investigating the issue.

Report: "OCPP Outage"

Last update
resolved

On the 11th of May around 03:15 (CET), the OCPP Bridge Proxy service, supporting the OCPP Bridge service, experienced a transient network issue causing parts of it to restart. The restarted instances had problems communicating with the backend service, resulting in approximately 25% of devices disconnecting across the board. As no direct root cause could be identified at the time, and with a 75% stable connection rate, it was decided to perform a full restart of the OCPP Bridge Proxy service at 08:18 (CET). This resolved the issue with the backend service, and we observed a steady recovery in connected devices, returning to 100% within 30 minutes. During this incident, there were issues with the internal alerting process responsible for notifying the team to update the status page. As a result, no status was created during the incident. This gap has been identified and addressed.

Report: "OCPP disconnections/restarts"

Last update
postmortem

## Summary

An intermittent and hard-to-diagnose issue impacted the OCPP Bridge, causing delays and failures in charging operations. The incident spanned multiple days, with various troubleshooting efforts including scaling, infrastructure and code changes. A temporary resolution was achieved on March 20, but problems resurfaced the next day, leading to the decision to migrate to a new Azure Service Bus instance. The incident was ultimately traced back to high CPU usage and server saturation on Microsoft’s physical infrastructure hosting the original Service Bus, something that was not visible in our metrics and was ultimately out of our hands.

## Impact

During the incident, third-party integrators and charging stations connected to our OCPP Bridge experienced problems with sending and receiving observations, resulting in delays in starting and stopping charging and, in some cases, being unable to start charging.

## Root Cause Analysis

On the 18th of March around 11PM we detected a few brief disconnects in our OCPP Bridge, which quickly stabilized. As the night continued we saw an increase in the frequency and duration of the drops. The following morning we established a dedicated team to focus on troubleshooting and fixing the issue. The system behavior we observed turned out to be very difficult to troubleshoot, since there was no clear indication of where the problem originated or what the cause could be. Throughout the day we implemented a number of possible fixes to try to pinpoint the root problem, and focused heavily on getting better insight by adding more metrics and logs, as well as making log levels more verbose. Due to the intermittent behavior of the problem it was difficult to know which hotfixes might have a positive or negative effect, and by the end of the day we had only been able to stabilize the backend for short periods and were no closer to finding the root cause. Nothing in our code, firmware or infrastructure suggested that we should be having problems.

On the 20th around 2:30PM we saw a “blip” on the primary Service Bus where the service was not reporting any metrics back to us, giving us reason to believe that Microsoft was making updates or changes to the backend of our specific Service Bus. After the “blip” all systems were back to normal without us having made any new changes to our code or infrastructure. To try to confirm that our systems were back and operational, we cycled some key consumers of the Service Bus by doing a full restart. We went into active monitoring until the next day, when we started to see identical problems at around 9:50PM. At this point we decided to migrate our applications to a new but identical Service Bus, as we were now as sure as we could be that the problem was on Microsoft’s end and not Zaptec’s. What made this particularly difficult to troubleshoot was the fact that our observability, in the form of metrics and logs, indicated no resource saturation in either our code or our infrastructure. After escalating the support request to Microsoft, we received confirmation that they did in fact have problems with the underlying servers that were hosting our infrastructure, but this was not communicated to us during the active incident.

## Action Items & Follow-Up

Even though the root cause turned out to be on Microsoft’s side, we have identified areas where we can improve to try to prevent these types of problems in the future. Azure Service Bus is a PaaS, meaning Microsoft handles the infrastructure, scaling, and maintenance in the operating system, network and physical server stack. We now know that we can’t always trust the metrics being sent to us. The only way we can try to eliminate these types of problems in a troubleshooting scenario is to deploy new PaaS infrastructure and migrate our services. This step is now being added earlier in our troubleshooting plan, and we will make the process quicker and easier by prioritizing the deployment in our Backup and Disaster Recovery plan.

This is being followed up by the Platform team. In addition to finding improvements to our current topology, we are exploring options to simplify our infrastructure and rely less on Service Bus or similar message brokers. This is an ongoing initiative involving Platform, developers and architects.

resolved

This incident has been resolved; a postmortem will be posted when ready.

monitoring

We are still monitoring the situation.

monitoring

There is still full functionality. You may experience some minor delays, but all operations should work as expected. We are still monitoring the situation.

monitoring

Full functionality has been enabled for the OCPP Bridge, and systems are running fine. You may experience some minor delays, but all operations should work as expected. We are still monitoring the situation. Thank you for your patience.

investigating

After the rollback at 10:00, we do see improvements, but we are still operating with limited functionality. All connected charging stations will work as expected. However, if any changes are made at the installation level (updates from installations, circuits, and chargers), OCPP messages will not be received. This includes changes to authentication mode and the addition or removal of installations, circuits, or chargers from OCPP. We are still working on a solution to this incident.

investigating

After enabling full functionality at 09:15, we do see some issues. We will roll back to limited functionality at 10:00 and keep investigating the issue.

monitoring

As our systems have been running as expected since the last update, we are going to enable full functionality for OCPP at 09:15.

monitoring

We are seeing improvements to the OCPP Bridge, and our systems are stable again after our latest deployment. However, OCPP has been deployed with limited functionality. All connected charging stations will work as expected. However, if any changes are made at the installation level (updates from installations, circuits, and chargers), OCPP messages will not be received. This includes changes to authentication mode and the addition or removal of installations, circuits, or chargers from OCPP. We are actively monitoring the situation, working on a solution, and expect full functionality to be restored by tomorrow morning, with further improvements ongoing.

investigating

We still don't know the root cause, and will continue the investigation through the evening. Thank you for your patience.

investigating

Our development team is fully focused on identifying the root cause and working towards a resolution. While we are still investigating and do not yet have a definitive explanation, please rest assured that we have our best people on the case. We will provide an update as soon as we have more clarity on the situation. Thank you for your patience and understanding.

investigating

It is still unclear what's causing the issues. We are continuing to investigate.

investigating

At this stage, we are still investigating the root cause of the issue and don’t have a definitive explanation yet. We will provide an update as soon as we have more clarity.

investigating

We will continue to investigate this issue today.

investigating

We will keep investigating this issue over the night.

investigating

We are continuing to investigate this issue.

investigating

We can see improvements on our systems as of now. But we are still investigating the issue.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

More logging has been activated to track down the issue. We will continue to investigate.

investigating

We are still on top of this issue, and we will continue to investigate. As of now we have no ETA on when we expect the problem to be fixed, but we will update frequently.

investigating

We are still investigating this issue.

investigating

We are continuing to investigate this issue.

investigating

At 13:00 we will reboot the OCPP Proxy. You will see chargers disconnect from the OCPP backend, but they should recover shortly after.

investigating

We are continuing to investigate this issue.

investigating

We are still investigating this issue.

investigating

We are continuing to investigate this issue.

investigating

We are seeing an increase of BootNotifications being sent from our charging stations without any clear reason. We are currently investigating this issue.

Report: "Zaptec Portal/App slow/unresponsive"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results. You may still experience some delay.

identified

The issue has been identified and a fix is being implemented.

investigating

The Zaptec Portal is unresponsive, and the Zaptec app is running slowly. We are actively investigating the issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Zaptec Portal/App slow/unresponsive"

Last update
resolved

This incident has been resolved. Zaptec portal and App should work as expected.

investigating

We are still investigating this issue.

investigating

We're currently experiencing delays in the Zaptec Portal and App. We are currently investigating the issue.

Report: "WebAPI / Zaptec Portal"

Last update
postmortem

During the rollout of new features, the system became unstable due to internal misuse of the legacy API, which had been left in place as a safety net for compatibility. Legacy mode has been disabled for all users, and the system started to behave normally under load.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Zaptec Portal slow/unresponsive"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Zaptec Portal slow/unresponsive"

Last update
postmortem

Due to increased load on one of the critical endpoints from a third party, we experienced saturation at the database level. On February 17th at 12:56 CET, we saw degradation of the API related to reads (OCPP and load balancing were not affected). We identified the root cause and introduced mitigating actions against the offending third party. At 14:55, the system stabilised, and at 19:12, the third party was rate limited on the problematic endpoint. We know the problematic part of the critical endpoint and plan to address it accordingly.
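The postmortem above does not say how the third party was rate limited. As an illustration only, a per-client, per-endpoint token bucket is one common way to do it. The client and endpoint names below are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size, in requests
    refill_per_sec: float  # sustained allowed request rate
    tokens: float = 0.0    # tokens currently available
    last: float = 0.0      # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per (client, endpoint) pair; both key values are hypothetical.
buckets = {
    ("third-party-x", "/critical-endpoint"): TokenBucket(capacity=2, refill_per_sec=1, tokens=2),
}

bucket = buckets[("third-party-x", "/critical-endpoint")]
# Three requests in quick succession, then one after a pause.
results = [bucket.allow(now=t) for t in (0.0, 0.1, 0.2, 1.5)]
```

In this sketch the burst of two is allowed, the third request is rejected, and the request after the pause passes again once the bucket has refilled.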

resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Sense APM functionality could be impacted"

Last update
postmortem

On Friday 14th of February at 12:35 CET the cloud-based Sense service, which provides Automatic Power Management (APM) functionality to Zaptec installations, began running below normal capacity, leading to a disruption of functionality. The degradation was identified at 12:37 CET. Our engineers started investigating logs and metrics for the service. We identified the root cause as an overlooked manual configuration change. This was restored to its original state and the service was again operational at 13:55 CET. We have updated our monitoring to identify similar problems much faster in the future, and are reviewing our processes to prevent manual configurations from persisting in our production environment.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Zaptec Portal unresponsive"

Last update
postmortem

During a maintenance event, the infrastructure health check started to fail and a specific part of the API could not recover. After we disabled the health check and forcibly routed traffic to the application, the service recovered. We will look into the liveness probe to see how it can be improved.

resolved

Zaptec Portal may be slow or unresponsive. Root cause identified.

Report: "Delays in processing charging events"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

You might see that the Zaptec App/Portal takes longer than expected to update. There might also be delays when authorizing a charging session with the app or an RFID tag. If you experience this, you can activate Stand Alone mode on your charging station through the Zaptec App or Portal. We are investigating the issue.

Report: "API unresponsive"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue. However, our systems are improving.

investigating

We are currently experiencing some issues with our API. This is being investigated.

Report: "Zaptec Portal performance issues"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate what caused this, but the Zaptec Portal is back to normal performance.

investigating

We are currently experiencing performance issues in Zaptec Portal.

Report: "Performance issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are seeing disconnects in the Zaptec Cloud. We are currently investigating the issue.

Report: "Zaptec Portal performance issues"

Last update
resolved

This incident has been resolved.

investigating

We are receiving reports that the Portal is running slow. We are currently investigating this issue.

Report: "Delays in processing charging events"

Last update
postmortem

Before the incident report team was in session at around 22:00 CEST, the systems had started to experience an ever-increasing latency, leading to users potentially experiencing delays when trying to control their charging stations. When the incident report team was set into session, the latency had reached 5 minutes or more for most, if not all. In the hours that followed, the team worked through numerous hypotheses and made several attempts to reduce the load, primarily by limiting retries and restarting applications. Despite these efforts, the root cause of the issue has not yet been fully identified, and we are still actively investigating it. Various code changes were prepared for deployment, aimed at improving several calls believed to function like multicast processes, such as duplicate occurrences of balancing etc. Unfortunately, these attempts were not deemed successful in reducing or eliminating the issue. However, the problem resolved itself at 00:00 CEST, as a natural dip in traffic pulled the overall load under an unknown threshold, stabilizing the system. At around 01:00 CEST, the response team deemed the system to have recovered, and further improvement attempts were paused. The situation was monitored for the next half hour, ending the response team’s efforts at 01:30 CEST. In the following days, the team will prioritize improving observability, redesigning traffic flows, and addressing how our systems and surrounding consumer services handle traffic. While the situation has stabilized, we are committed to identifying the root cause and ensuring long-term stability.

resolved

The incident has been resolved. We will continue to monitor this internally. A postmortem further detailing the incident will be published.

investigating

We are continuing to investigate this issue.

investigating

We are still seeing delays in processing charging events, and an incident response team will keep investigating the issue during the night. You might see that the Zaptec App/Portal takes longer than expected to update. There might also be delays when authorizing a charging session with the app or an RFID tag. If you experience this, you can activate Stand Alone mode on your charging station through the Zaptec App or Portal. The next update is expected tomorrow morning (16.10.2024) at 08:00. We apologize for the inconvenience.

investigating

We are continuing to investigate this issue.

investigating

We're seeing delays in processing charging events. We are investigating the issue.

Report: "Delays in processing charging events and disconnections"

Last update
resolved

This incident has been resolved.

investigating

Services appear to have stabilized, but we will continue to investigate throughout the night. Next update will be no later than 07:00 tomorrow morning (18.10.2024).

investigating

We are still investigating this issue.

investigating

We are currently investigating this issue.

Report: "Delays in processing charging events and disconnections"

Last update
resolved

This incident has been resolved.

monitoring

All systems are operational and back to normal service. We will continue monitoring.

investigating

We are continuing to investigate the issue. We have no ETA as of now.

investigating

We're again seeing limited problems regarding network connectivity, and some delays in processing charging events. We are investigating the issue.

monitoring

All systems have been back to normal since ~1AM. We will keep monitoring, and spend the day figuring out what caused the issue.

investigating

We're seeing limited problems regarding network connectivity, and some delays in processing charging events. We are investigating the issue.

Report: "Drop in LTE-M connections Denmark"

Last update
resolved

LTE-M connections have been re-established.

investigating

Charging stations appear to have gone offline in Denmark due to an outage at a cellular provider. This has also caused the OCPP connections to drop on some charging stations. This is currently out of our control, but we are actively monitoring the situation.

Report: "Eco mode issues"

Last update
resolved

A fix has been implemented, and Eco mode is working as expected again.

identified

The issue has been identified, and we are working on a fix.

monitoring

If you have Eco mode active on your Zaptec Go charging station, you might see charging start at an unexpected time. The issue has been identified, and we are working on a fix.

Report: "Drop in LTE-M connections Denmark"

Last update
resolved

This incident has been resolved.

monitoring

Charging stations appear to have gone offline in Denmark due to an outage at a cellular provider. This has also caused the OCPP connections to drop on some charging stations. This is currently out of our control, but we are actively monitoring the situation.

Report: "Delays in processing charging events and OCPP disconnections"

Last update
postmortem

We experienced slowdowns from Azure IoT Hub, which had a cascading effect in our messaging service, causing messages to pile up for several minutes. The slowdown itself originated in Azure, but things can be done on our end to avoid issues like this. We’ve changed some messages to not require synchronous behaviour (i.e. they don’t need to wait for messages ahead to be processed) and we have decreased the timeout value that messages will wait before giving up (thus making a pile-up less likely in another slowdown). In the longer term we’re working on a multiple IoT Hub solution, which should reduce the likelihood of one Azure slowdown affecting all chargers.
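To illustrate why a shorter wait timeout limits a pile-up, here is a toy simulation of a single serial consumer during an upstream slowdown. It is a sketch under stated assumptions, not the actual messaging service: the function name `drain`, the arrival pattern, and all timing values are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


def drain(arrivals: List[float], service_time: float, timeout: float) -> Tuple[int, int]:
    """Simulate one serial consumer working through a queue.

    arrivals: times (seconds) at which messages are enqueued, ascending.
    service_time: seconds the slow upstream takes per message.
    timeout: longest a message may wait in the queue before it is abandoned.
    Returns (processed, abandoned).
    """
    queue: Deque[float] = deque(arrivals)
    clock = 0.0
    processed = 0
    abandoned = 0
    while queue:
        enqueued_at = queue.popleft()
        clock = max(clock, enqueued_at)  # consumer idles until the message exists
        if clock - enqueued_at > timeout:
            abandoned += 1               # give up rather than add to the backlog
            continue
        clock += service_time
        processed += 1
    return processed, abandoned


arrivals = [float(t) for t in range(10)]  # one message per second
long_timeout = drain(arrivals, service_time=3.0, timeout=60.0)
short_timeout = drain(arrivals, service_time=3.0, timeout=5.0)
```

With the long timeout every message waits its turn, so waiting time keeps growing for as long as the slowdown lasts. With the short timeout, stale messages are abandoned and the consumer stays close to real time, at the cost of the abandoned messages needing a retry or being dropped.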

resolved

This incident has been resolved.

monitoring

We will continue to monitor the situation over the weekend, to make sure all systems stay operational.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

All systems are still running as they should. We will monitor the situation until tomorrow morning, and keep investigating to find the root cause. However, we do have a suspicion of what might have caused this to happen.

investigating

We are still investigating this issue.

investigating

We are continuing to investigate this issue.

investigating

OCPP connections remain stable, and the processing of charger events is back to normal. We are currently investigating what caused this and how to prevent it from happening again.

monitoring

We are continuing to monitor for any further issues.

monitoring

OCPP connections are stabilizing, and the processing of charging events is improving. We will keep monitoring the situation until it's all back to normal.

investigating

We are continuing to investigate this issue.

investigating

We are still investigating this issue.

investigating

We are currently investigating this issue.

Report: "WebAPI, Portal and the Zaptec - High failure rates"

Last update
resolved

We have addressed the problem and things are now stable.

investigating

We are continuing to investigate this issue. No new info as of yet.

investigating

We are currently investigating this issue.

Report: "Drop in LTE-M connections"

Last update
resolved

This incident has been resolved.

monitoring

Charging stations appear to have gone offline in Denmark due to an outage at a cellular provider. This has also caused the OCPP connections to drop on some charging stations. This is currently out of our control, but we are actively monitoring the situation.

investigating

We have received reports that charging stations are losing connectivity when connected to LTE-M. We are currently investigating this issue.

Report: "Drop in LTE-M connections"

Last update
resolved

This incident has been resolved.

identified

Charging stations appear to have gone offline due to a major outage at our cellular provider, Telenor. This has also caused the OCPP connections to drop on some charging stations. This is currently out of our control, but we are actively monitoring the situation. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

identified

We have identified the root cause of the issue. Charging stations appear to have gone offline due to a major outage at our cellular provider, Telenor. This has caused the OCPP connections to drop. This is currently out of our control, but we are actively monitoring the situation. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

investigating

Charging stations appear to have gone offline due to a major outage at our cellular provider, Telenor. This has caused the OCPP connections to drop. This is currently out of our control, but we are actively monitoring the situation. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

investigating

We are seeing drops of OCPP connections. We are currently investigating this issue. This may cause problems with authorization and starting to charge for some users. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

Report: "OCPP disconnections"

Last update
resolved

This incident has been resolved.

investigating

Things are back to normal, but we will keep investigating over the next few days.

investigating

Charging stations are still reconnecting. We are still investigating.

investigating

OCPP Bridge will be restarted in ~5 minutes. Expect full disconnection. The reconnection rate should be improved immediately after.

investigating

We are seeing drops of OCPP connections. We are currently investigating this issue. This may cause problems with authorization and starting to charge for some users. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

Report: "OCPP disconnections"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are seeing drops of OCPP connections. We are currently investigating this issue. This may cause problems with authorization and starting to charge for some users. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

Report: "Chargers appearing offline"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are receiving reports that some chargers are appearing offline, even though they are online. We are currently investigating this issue.

Report: "Registration issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has now been implemented, and we are monitoring the results. If you are still missing the activation email from us, please try registering again with the email address you want to use.

investigating

We are receiving reports that users can't get registered through the Zaptec App or Zaptec Portal. We are currently investigating.

Report: "Drops of OCPP connections"

Last update
resolved

This incident has been resolved.

identified

We have identified the root cause of the issue. Charging stations have gone offline due to a major outage at our cellular provider, Telenor. This has caused the OCPP connections to drop. This is currently out of our control, but we are actively monitoring the situation. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

investigating

We are seeing drops of OCPP connections. We are currently investigating this issue. This may cause problems with authorization and starting to charge for some users. What is OCPP? OCPP is an open-source communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

Report: "OCPP disconnections"

Last update
resolved

This incident has been resolved.

monitoring

OCPP connections have now been recovered. We are now monitoring the results.

identified

We are still working on restoring the OCPP connections.

identified

Earlier today we deployed an update to our API. After a short time we noticed that it carried an issue, and we decided to roll back the API to a previous version. Unfortunately, the rollback was accidentally done to a version of the API that carried the issue seen a couple of weeks ago, which caused that issue to repeat. The API has now been reverted to the correct version (i.e. the one running this morning, before today’s deployment). This makes the API endpoints behave as normal, without leading to data loss. Unfortunately, some data might have been corrupted if the endpoints were used between 14:00 and 15:20 CEST. We are working to restore this now using the same procedure as the one used last time. Since we know exactly what happened and how to address the situation, we expect a faster turnaround on this problem.

identified

There has been a data corruption incident today that led some customers' installations to lose the URL specifying connectivity to their OCPP central system. We are currently working on restoring this data. If you are able to reset the OCPP URL, this will also re-establish the OCPP connection.

Report: "OCPP disconnections"

Last update
postmortem

At 14:06 CEST a deployment was made to our API that included work done on two features. This work included an inadvertent change of behaviour to several endpoints used to control the configuration of installations and devices:

* `POST /charger`
* `PUT /charger/{id}`
* `POST /installation`
* `PUT /installation/{id}`
* `POST /installation/{id}/update`

With this change, if the JSON body of a request to one of these endpoints was missing any of the OCPP Bridge configuration parameters (central system URL, default tag ID, or password), the system deleted those missing parameters from the device/installation configuration. This differs from the previous behaviour, where a missing parameter left the configured value intact.

Since many customers use these endpoints to change the behaviour of their installations and devices, this wiped their OCPP Bridge configuration if it was previously set. Unfortunately, because those endpoints are called very frequently in certain scenarios (like setting the available current on an installation), several thousand devices lost their OCPP connection.

The issue was quickly uncovered by a partner who started having severe issues, since they call one of the endpoints very frequently for every installation that they manage, which caused OCPP settings to be wiped. Once aware, Global Support quickly brought the issue to the attention of developers.

Once the behaviour was noticed, it was quickly correlated to the recent deployment, and a quick analysis revealed the root cause. At 15:54 CEST, the API was reverted to the previous state, and the endpoint behaviour returned to normal. However, the data for OCPP Bridge configuration had already been wiped for many installations at this point. An incident response team was set up at 16:54 CEST and it immediately started working on strategies to recover the corrupted data.

A point-in-time restore of the database was initiated to obtain a copy of the data that could be used to fix affected installations. The team also identified that all the corruption had been tracked in our Change Log for every installation. This redirected efforts to locating all the damaged data within the Change Log for both Chargers and Installations, allowing us to identify affected installations as well as individual charging stations. The Change Log was retrieved, and the team focused on creating a script to restore the data. The behaviour of the script was verified in stages, and it was eventually applied to all installations, restoring the data. Data for the individual charging stations was corrected manually by Global Support.

Improvements identified

* We need unit tests for all API endpoints that accept partial data, ensuring that existing values are kept.
* Apply more tooling for using the Change Log, especially for data recovery.
* Our database’s point-in-time restore was slower than expected.

Timeline (all times CEST)

* 14:06: Deployment completed
* Prior to 15:50: Issue discovered and researched
* 15:54: Rollback of deployment to prevent further damage
* 16:39: Triggered restore of the database to a separate copy to retrieve the list of the installations before the corruption
* 16:53: Incident response team formed
* 18:11: Retrieving Change Log for all installations since 14:00
* 20:47: Manually restoring charger-level settings
* 21:10: Restore limited installations as verification
* 21:18: Restore the remaining installations
* 21:57: Issue is deemed resolved
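
The first improvement listed above, unit tests for endpoints that accept partial data, can be illustrated with a minimal sketch. It assumes a merge-style update in which keys missing from the request body keep their stored values. The field names, the `apply_partial_update` helper, and the choice to treat an explicit `None` (JSON `null`) as a deliberate clear are hypothetical and are not taken from Zaptec's API.

```python
from typing import Any, Mapping

# Hypothetical names for the three OCPP Bridge parameters mentioned in the
# postmortem; the real API's field names are not given there.
OCPP_FIELDS = ("ocpp_url", "ocpp_default_tag_id", "ocpp_password")


def apply_partial_update(
    stored: Mapping[str, Any], body: Mapping[str, Any]
) -> dict[str, Any]:
    """Merge a partial request body into the stored configuration.

    Keys absent from the body keep their stored value. A key sent with an
    explicit None is treated as a deliberate clear (an assumption made for
    this sketch).
    """
    updated = dict(stored)
    for key, value in body.items():
        if value is None:
            updated.pop(key, None)
        else:
            updated[key] = value
    return updated


def test_missing_ocpp_fields_are_preserved() -> None:
    stored = {
        "ocpp_url": "wss://csms.example.invalid/ocpp",
        "ocpp_default_tag_id": "TAG-1",
        "ocpp_password": "secret",
        "available_current": 32,
    }
    # A frequent call that only changes the available current.
    updated = apply_partial_update(stored, {"available_current": 16})
    assert updated["available_current"] == 16
    for field in OCPP_FIELDS:
        assert updated[field] == stored[field]


test_missing_ocpp_fields_are_preserved()
```

A regression test of this shape, run against each of the affected endpoints, would fail if a missing parameter were ever interpreted as a request to delete the stored value.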

resolved

The restore is complete, and the OCPP configuration data should now be restored for all installations.

identified

We are still working on restoring the OCPP connections.

identified

We are still working on restoring the OCPP connections. If you are able to reset the OCPP URL yourself, doing so will also re-establish the OCPP connection. How to do this is described in the first steps of this guide: https://help.zaptec.com/hc/en-001/articles/12771090871825-Connecting-your-Zaptec-Installation-to-an-OCPP-Server

identified

We are continuing to work on restoring the OCPP connections. Our aim is to restore them as soon as possible, but if you are able to reset the OCPP URL yourself, doing so will re-establish the OCPP connection. How to do this is described in the first steps of this guide: https://help.zaptec.com/hc/en-001/articles/12771090871825-Connecting-your-Zaptec-Installation-to-an-OCPP-Server

identified

There was a data corruption incident today that led some customers’ installations to lose the URL specifying connectivity to their OCPP central system. This caused the OCPP connections to drop. We are currently working on restoring this data.

Report: "OCPP / Charging issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are receiving reports of issues with starting charging when using third-party/payment solutions, and we are currently investigating. You might experience issues when trying to start charging. We apologize for the inconvenience this may cause.

Report: "Zaptec Sense not reporting data"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We have received reports that Zaptec Sense is not reporting data, and is displayed as offline in Zaptec Portal and App. We are currently investigating this issue.

Report: "OCPP / Charging issues"

Last update
resolved

This incident has been resolved.

investigating

We are still investigating to find the root cause of this issue. However, we've deployed some changes that should make authorization more stable. We will continue to monitor this over the weekend. If you are still experiencing issues initiating a charging session, please try authenticating again through the App or by scanning the RFID tag.

investigating

We're still investigating issues with starting charging when using third-party or payment solutions. If you are experiencing this, please try authenticating again through the App or by scanning the RFID tag.

investigating

We are still investigating issues with starting charging when using third-party or payment solutions. We do not have an ETA for a fix at this moment, but we’re receiving reports that a decreasing number of customers are affected by the issue. If you are experiencing this, please try authenticating again through the App or by scanning the RFID tag.

investigating

We are still investigating issues with starting charging when using third-party or payment solutions. We do not have an ETA for a fix at this moment, but we're receiving reports that a decreasing number of customers are affected by the issue.

investigating

We've received an increased number of reports of issues with starting charging when using third-party or payment solutions. We are investigating this issue to find the root cause. We do not have an ETA for a fix at this moment.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have received reports of difficulties initiating charging through third-party payment solutions. We will continue to monitor the situation and apologise for any inconvenience this may cause.

investigating

We are receiving reports of issues with starting charging when using third-party/payment solutions, and we are currently investigating. You might experience issues when trying to start charging. We apologize for the inconvenience this may cause.

Report: "Delayed APM readings"

Last update
resolved

This incident has been resolved.

monitoring

The visual graph in Zaptec Portal might not display charging energy, but the APM is working properly and will still adjust the charging current. The issue has been identified and a fix has been implemented. We are monitoring the current behaviour.

Report: "Missing OCPP-messages"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We have received reports of some OCPP messages not going through. We are investigating the behavior of our systems and will update as soon as we find the root cause of the issue.

What is OCPP? OCPP is an open communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

investigating

We are receiving reports of some OCPP messages not going through. This may result in issues starting to charge your vehicle. We are currently investigating.

What is OCPP? OCPP is an open communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

Report: "Delayed charging start and OCPP messages"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

Charging should now work as expected. If you experience trouble when starting charging, we recommend unplugging the charging cable and then plugging it back in. This restarts the charging session, and in our experience it solves the problem in most cases. We continue to monitor our cloud very closely, and we will keep all our relevant resources working on improving our systems by updating our infrastructure and architecture for a more long-term solution.

What is OCPP? OCPP is an open communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

monitoring

Charging should now work as expected. If you experience trouble when starting charging, we recommend unplugging the charging cable and then plugging it back in. This restarts the charging session, and in our experience it solves the problem in most cases. We continue to monitor our cloud very closely, and we will keep all our relevant resources working on improving our systems by updating our infrastructure and architecture for a more long-term solution.

What is OCPP? OCPP is an open communication standard for EV charging stations and network software companies. Third-party operators who use Zaptec’s hardware to connect with their own payment solution normally communicate through the OCPP standard.

monitoring

We are still monitoring this issue. Services are still stable and charging should work as expected. We recommend removing the charging cable and inserting it again for users who still fail to start charging. This action will restart the charging session and will in some cases solve the issue.

monitoring

We are still monitoring this issue. The temporary fix stabilized things during the night for most users, and charging should work as expected. We recommend removing the charging cable and inserting it again for users who still fail to start the charging session. This action will restart the charging session.

investigating

A temporary fix has been deployed that seems to have stabilized things. We are still monitoring this. We recommend removing the charging cable and inserting it again for users who still fail to start the charging session. This action will restart the charging session. If you use the app or RFID card to identify yourself before starting charging, please do so as usual. If you do not want to wait for charging to start, there is an option to activate standalone on your charger. This will cause the charging station to stop communicating with our server, and charging will begin immediately. Please see this guide on how to enable standalone: https://help.zaptec.com/hc/en-001/articles/22655325267345-How-to-activate-Stand-Alone-mode#h_01HPXYJ1PDGYED2MTGHJ8PPBW6 We apologize for any inconvenience this may cause. Our top experts are on the case, and we are working around the clock to address the issue.

investigating

A temporary fix has been deployed that seems to have stabilized things. We are currently monitoring.

investigating

We are still experiencing instability on our server, resulting in a delay of up to 10 minutes in charging initiation. Users who wish to charge during this period can continue doing so by plugging in their charging cable and initiating charging as they normally would. Charging will begin, though it might take longer than usual. If you use the app or RFID card to identify yourself before starting charging, please do so as usual. If you do not want to wait for charging to start, there is an option to activate standalone on your charger. This will cause the charging station to stop communicating with our server, and charging will begin immediately. Please see this guide on how to enable standalone: https://help.zaptec.com/hc/en-001/articles/22655325267345-How-to-activate-Stand-Alone-mode#h_01HPXYJ1PDGYED2MTGHJ8PPBW6 We apologize for any inconvenience this may cause. Our top experts are on the case, and we are working around the clock to address the instability.

investigating

We are currently experiencing instability on our server, resulting in a delay of up to 10 minutes in charging initiation. Users who wish to charge during this period can continue doing so by plugging in their charging cable and initiating charging as they normally would. Charging will begin, though it might take longer than usual. If you use the app or RFID card to identify yourself before starting charging, please do so as usual. We apologize for any inconvenience this may cause. Our top experts are on the case, and we are working around the clock to address the instability.

investigating

Engineers are still working on identifying the root cause of the issue. Still no ETA.

investigating

Engineers keep working on identifying the issue. Still no ETA.

investigating

We are seeing problems again; our engineers are looking into them.

investigating

We're currently upgrading some infrastructure which may lead to some delays and unresponsiveness for a while.

investigating

Another fix has been implemented, and we can see even more improvements now. You may still experience some delays when authorizing and stopping charging sessions through an OCPP provider. We will continue to investigate.

investigating

We can see some improvements after a fix was implemented, but the incident is still ongoing. We will keep investigating and will update as soon as possible.

investigating

We are continuing to investigate this issue. No ETA as of yet.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "API is unresponsive"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

A fix has been implemented, and we are monitoring performance.

investigating

Zaptec Portal and App are experiencing slowness. We are investigating the issue.

Report: "Zaptec API outage"

Last update
postmortem

Between 03:50 and 07:25 we experienced an outage on our public API. This was caused by our monitoring system reacting incorrectly when all API services became simultaneously unavailable for a brief moment. This led to the whole class of services being classified as unavailable and unable to process requests, which in turn prevented the system from self-healing. Our engineers identified and corrected the problem, bringing the API back online. We also introduced a change that ensures at least one resource is kept operational to enact self-healing in case of a general outage.
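
One way a safeguard of this kind could look is sketched below. It is only an illustration of the general idea of keeping a minimum number of instances in rotation even when every health check fails; the `Instance` type, the `instances_to_keep_in_rotation` function, and the `min_available` parameter are hypothetical and do not describe Zaptec's monitoring system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    healthy: bool


def instances_to_keep_in_rotation(
    instances: list[Instance], min_available: int = 1
) -> list[Instance]:
    """Pick the instances that stay eligible to receive requests.

    Healthy instances are always kept. If fewer than `min_available` are
    healthy, unhealthy instances are kept too (in the given order) until
    the floor is met, so a brief simultaneous failure cannot empty the pool.
    """
    healthy = [i for i in instances if i.healthy]
    if len(healthy) >= min_available:
        return healthy
    unhealthy = [i for i in instances if not i.healthy]
    return healthy + unhealthy[: min_available - len(healthy)]


# All instances fail their check at the same moment: one is still kept.
pool = [Instance("api-1", False), Instance("api-2", False)]
kept = instances_to_keep_in_rotation(pool)
print([i.name for i in kept])
```

Keeping one instance in rotation means that requests can start succeeding again as soon as that instance recovers, instead of the whole class of services staying marked as unavailable.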

resolved

This incident has been resolved.

Report: "Zaptec.com is unresponsive"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Zaptec.com is unresponsive"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Portal issues."

Last update
postmortem

We deployed a change to permissions that contained a faulty query. The faulty deployment has been rolled back.

resolved

This incident has been resolved.

investigating

We are experiencing some issues with loading data in the portal. We are investigating and will update soon.

Report: "Zaptec Portal performance issues"

Last update
resolved

We are going into a new phase with unscheduled maintenance.

investigating

We keep working on performance mitigation while working on more fundamental improvements. We expect more stability in our Portal in the coming weeks.

investigating

We continue with our mitigation and upgrade plans. Since we are changing some of our infrastructure, it takes time to validate changes and ensure that our systems remain operational. Over the next few days we expect to gradually bring these changes to our production environment, and we hope to see performance improvements during next week. In the meantime, we continue to observe sporadic high levels of demand that can cause slow responsiveness. If you experience a timeout, please retry the operation.

investigating

We are continuing to investigate the issue. In the meantime if operations time out, please retry.

investigating

We are receiving reports about issues with the portal. We are investigating the issue, and will update soon.

Report: "Chargers appearing offline"

Last update
resolved

While deploying a performance improvement to our systems, the process that detects online/offline status of chargers became unresponsive for a longer-than-expected period. This led to chargers temporarily being marked as “offline”. The change was reverted and the detection process was restarted, which made chargers return to online status. We plan to change the deployment process to avoid this behavior before we attempt a deployment again.

Report: "Slowness in user interfaces and API"

Last update
postmortem

On Thursday 7th September, Zaptec APIs and user interfaces (Portal and mobile apps) started experiencing intermittent performance issues, causing long response times for users interacting with our services. Zaptec engineers identified the cause as a cumulative effect of multiple systems dealing with increasing load from users and third parties, leading to cascading delays from system to system. For the following two days we addressed multiple bottlenecks in our database infrastructure that gradually alleviated the problem, while experiencing further increases in load that again caused intermittent slowdowns. From Saturday 9th our database optimizations stabilized system performance, which we have since been monitoring. We also continue to work on substantial improvements to our data processing architecture, which we expect to put in place in the coming weeks. This will allow Zaptec to scale its capacity to handle increasing load as our installed base continues to grow.

resolved

This incident has been resolved.

monitoring

A fix has been applied and we're currently monitoring our systems to confirm that performance returns to normal levels.

investigating

We are currently experiencing sporadic bursts of activity that cause a slowdown to our interfaces (Portal and App) and our API. Some operations can take several seconds to complete, and occasionally time out. If you experience a timeout, please wait some seconds and try the operation again. Our engineers are actively investigating and attempting to mitigate the issue.
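
For API clients, the advice above (wait a few seconds after a timeout, then try again) can be sketched as a small retry helper. This is a generic illustration, not an official client: the `retry_on_timeout` name, the default delays, and the use of Python's built-in `TimeoutError` are assumptions, and a real HTTP library will raise its own timeout exception type.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_timeout(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with exponential backoff on TimeoutError.

    Waits base_delay, then 2x, then 4x, and so on between attempts, and
    re-raises the timeout if the final attempt also fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2**attempt)
    raise AssertionError("unreachable")


# Usage: an operation that times out twice before succeeding.
calls = {"count": 0}


def flaky_operation() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


waits: list[float] = []
result = retry_on_timeout(flaky_operation, sleep=waits.append)
```

Increasing the wait between attempts avoids adding more load to a service that is already slow, which matters when many clients retry at the same time.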

investigating

We are continuing to investigate this issue. Currently, we do not have any new information.

investigating

We are continuing to investigate this issue.

investigating

We are receiving reports about issues with the portal. We are investigating the issue, and will update soon.

Report: "Some OCPP connected chargers disconnecting and reconnecting"

Last update
postmortem

On Monday July 17th we received reports of some chargers connected to OCPP disconnecting and reconnecting again. As nothing had been changed at our end in the previous few weeks, it was difficult to troubleshoot. It was identified as an issue with the OCPP bridge. On Wednesday July 19th the issue was resolved. The instances of the OCPP bridge were on a node that was under heavy network usage. We were able to shuffle some workload around, resulting in a stable service.

resolved

This incident has been resolved.

investigating

We are getting reports that some chargers connected to OCPP are disconnecting and reconnecting again. We are investigating and will update the status as soon as we have anything new to report.

Report: "Chargers disconnected from OCPP"

Last update
resolved

Some chargers connected to an OCPP provider were struggling with the connection. The issue was resolved at 13:50 local time. One of our OCPP instances was struggling; after we started up a new instance, the problem was solved.

Report: "Some chargers connected to OCPP get disconnected"

Last update
resolved

Some chargers connected to an OCPP provider were struggling with the OCPP connection. Incident start: 01:59 (UTC). Incident resolved: 08:50 (UTC). One of our OCPP instances was struggling; after we started up a new instance, the problem was solved.