iAdvize (HA)

Is iAdvize (HA) Down Right Now? Check if there is a current outage ongoing.

iAdvize (HA) is currently Operational

Last checked from iAdvize (HA)'s official status page

Historical record of incidents for iAdvize (HA)

Report: "P2 - stats indexing problem"

Last update
resolved

The issue is resolved

monitoring

We are continuing to monitor for any further issues.

monitoring

All stats are now displayed. We are monitoring now.

investigating

We are performing new actions to mitigate the issue. The root cause is partially found.

investigating

We are still performing actions to resolve the stats indexation issue.

investigating

We are facing an issue related to the stats. Some of them won't update for now. We are currently working on this.

Report: "P1 - Copilots are not responding"

Last update
resolved

This incident has been resolved.

monitoring

We found the root cause of the issue, and a fix is now live. Everything is working normally. We continue to monitor production.

investigating

We are having a production issue right now: the bots are not able to reply to visitors. We are currently investigating the cause.

Report: "P1 - Impossibility for Visitors and Agents to start a video conversation"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

Please know that the video stream conversations between Visitors and Agents are out of order. We are investigating the issue.

Report: "P1 - Impossibility for visitors to create a conversation"

Last update
resolved

A brief 9-minute incident (from 5:00 p.m. to 5:09 p.m. CET) impacted conversation creation, temporarily preventing visitors from reaching agents. The issue was promptly identified and resolved, and services are now fully operational. We are monitoring the situation to ensure continued stability.

Report: "P2 - Latency in Copilot's response time"

Last update
resolved

The latency issues have been resolved. Copilot's response time has been back to normal since 11:54am (CET). The latency lasted from 11:10am to 11:54am (CET). Thanks for your understanding.

monitoring

The latency issues have been resolved. Copilot's response time has been back to normal since 11:54am (CET).

investigating

We are currently experiencing latency in Copilot's response time. Copilot can take several seconds to respond. We are investigating.

Report: "P1 - Bots stopped responding to visitors"

Last update
resolved

On Thursday, February 13, there were disruptions related to the bots. From 16:54 to 17:38 CET, the chatbots did not receive visitor messages and were unable to respond to conversations. As a result, those conversations were left unanswered and were closed by the auto-closing bot feature.

Report: "P1 - Perturbations in engagement and conversation creation"

Last update
resolved

This incident has been resolved since 3:48pm.

monitoring

A fix has been implemented, and everything is back to normal. You should be able to see notifications on your Website again. We keep monitoring the production. Thank you.

investigating

You may have been experiencing disturbances in conversation routing since 3:39pm CET. We are currently investigating this issue and its impacts. We will keep you informed. Thank you for your patience and understanding.

Report: "P1 - Perturbations in engagement and conversation creation"

Last update
resolved

The issue started at 2:33 PM and was resolved by 2:54 PM, resulting in approximately 20 minutes of perturbation that impacted engagement and conversation creation. The display of conversation notifications on the sites was random. Only a fraction of visitors could start a conversation. We apologize for the inconvenience.

Report: "P1 - Disturbances on the Conversations Panel"

Last update
postmortem

**Incident**

On November 20th (17:19 to 17:52 CET) and November 21st (9:30 to 9:43 CET), we experienced two incidents that degraded the user experience on the Conversation panel and Administration. During these periods, conversation processing by agents was disrupted by white screens or error messages. In addition, managers' monitoring of stats reports was also affected by error messages.

These disturbances resulted from changes made to the platform infrastructure as part of our regular, scheduled system maintenance. Although initially assessed as not a major risk and validated in a pre-production environment, these planned actions had an unexpected impact on platform stability. Access to services critical to the proper operation of the platform was temporarily cut.

**Resolution**

To solve this issue, our technical team had to manually change some settings on these critical services and then restart them. Getting the required underlying services back to their nominal state allowed the Conversation panel and the Administration application to return to their own nominal state.

**Actions for the future**

* (Done) Review our internal processes to ensure that customer communication on our [status page](https://status.iadvize.com/) is more responsive
* (Done) Review our maintenance process to better identify and scope potential negative impacts on the iAdvize platform and adapt our execution plans accordingly
* (Done) Improve probes and alerting on failing services to improve reactivity

**Focus on the Black Friday period**

Looking ahead to the next critical period, we're confident that we'll handle incoming traffic on the iAdvize platform without disruption. This incident was the consequence of manual actions whose impact had not been adequately anticipated. It is not a problem related to traffic management or platform scaling.

In the meantime, we have been proactive in getting the iAdvize platform ready, and we have reviewed the teams' preparation for this high-traffic period. Our modus operandi is based on three pillars, which have already been identified and implemented:

* Freeze period: no new code deployed to production
* Stress test: testing the platform's scalability under heavy peak traffic
* Team mobilization: assigning the right people to monitor the main components 24/7

Be assured that our team and platform are ready for the end of the year.

resolved

After a period of monitoring, we confirm that this incident is resolved. A post-mortem will be published soon. We are sorry for the inconvenience caused.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are experiencing some disturbances on the platform. You may notice blank pages on the conversation panel or in the administration, as well as difficulties closing conversations. The technical team is investigating the issue. We will keep you informed.

Report: "P1 - Disturbances on the Conversations Panel and Administration App"

Last update
postmortem

**Incident**

On November 20th (17:19 to 17:52 CET) and November 21st (9:30 to 9:43 CET), we experienced two incidents that degraded the user experience on the Conversation panel and Administration. During these periods, conversation processing by agents was disrupted by white screens or error messages. In addition, managers' monitoring of stats reports was also affected by error messages.

These disturbances resulted from changes made to the platform infrastructure as part of our regular, scheduled system maintenance. Although initially assessed as not a major risk and validated in a pre-production environment, these planned actions had an unexpected impact on platform stability. Access to services critical to the proper operation of the platform was temporarily cut.

**Resolution**

To solve this issue, our technical team had to manually change some settings on these critical services and then restart them. Getting the required underlying services back to their nominal state allowed the Conversation panel and the Administration application to return to their own nominal state.

**Actions for the future**

* (Done) Review our internal processes to ensure that customer communication on our [status page](https://status.iadvize.com/) is more responsive
* (Done) Review our maintenance process to better identify and scope potential negative impacts on the iAdvize platform and adapt our execution plans accordingly
* (Done) Improve probes and alerting on failing services to improve reactivity

**Focus on the Black Friday period**

Looking ahead to the next critical period, we're confident that we'll handle incoming traffic on the iAdvize platform without disruption. This incident was the consequence of manual actions whose impact had not been adequately anticipated. It is not a problem related to traffic management or platform scaling.

In the meantime, we have been proactive in getting the iAdvize platform ready, and we have reviewed the teams' preparation for this high-traffic period. Our modus operandi is based on three pillars, which have already been identified and implemented:

* Freeze period: no new code deployed to production
* Stress test: testing the platform's scalability under heavy peak traffic
* Team mobilization: assigning the right people to monitor the main components 24/7

Be assured that our team and platform are ready for the end of the year.

resolved

This incident has been resolved. More information on this status will be provided in the coming days. Thank you for your patience.

monitoring

From 5:19PM CET to 5:52PM CET, we encountered disturbances across the whole platform. You may have noticed blank pages on the conversation panel or in the administration, as well as difficulties closing conversations. The cause has been identified, and the technical team has restored the situation. We continue to monitor the platform.

Report: "P1 - Perturbations in engagement and conversation creation"

Last update
resolved

The issue started at 10:35 AM and was resolved by 10:50 AM, resulting in approximately 15 minutes of perturbation that impacted engagement and conversation creation. The display of conversation notifications on the sites was random. Only a fraction of visitors could start a conversation.

Report: "P1 - Impossibility for visitors to start new conversations"

Last update
resolved

This incident has been resolved.

monitoring

We found the root cause of the issue, and a fix is now live. Everything has been working normally since 2:40pm (CET). We continue to monitor production.

investigating

Please note that ongoing conversations are not impacted. Only the creation of new conversations is affected.

investigating

It is not possible for visitors to start conversations. This incident is impacting all our services. Our technical team is currently looking into the issue, and we will update you as we learn more.

Report: "P3 - Disturbances on conversation routing"

Last update
resolved

After a period of monitoring, we confirm that the incident is resolved. We apologise for any inconvenience caused.

monitoring

We have mitigated the problem and conversation processing is working normally. Our technical teams remain mobilized to ensure that the platform functions normally.

investigating

You may have been experiencing disturbances in conversation routing since 2:10pm CET. We are currently investigating this issue and its impacts. We will keep you informed. Thank you for your patience.

Report: "P2 - Random disturbances on the conversation panel"

Last update
resolved

For about ten minutes, from 3:38pm to 3:48pm (CEST), you may have experienced some issues on the conversation panel:

- Ongoing messages disappear
- Ongoing messages do not send
- Conversations cannot be closed

This error was random, and not all agents were affected. Our technical team has identified the issue and fixed it. You will probably need to refresh the conversation panel to restore it. Thanks for your understanding.

Report: "P1 - Conversation notifications might not be visible on websites"

Last update
postmortem

**Incident**

On September 2nd, between 11:01 CEST and 13:04 CEST, we experienced an incident impacting the service in charge of iAdvize engagement (handling targeting).

During this timeframe, the display of notifications on our customers' websites and on mobile applications fluctuated between appearing intermittently and not appearing at all. As a result, starting a conversation from the Chat / Call / Video / mobile application channels was degraded (86 min) or even completely cut off (37 min). Social channels were not impacted.

This unavailability of our engagement service occurred because:

* After a restart, our mirroring service moved to the same server instance as our engagement service
* Due to an unexpected resource usage spike on the mirroring service, the engagement service was left with insufficient resources to scale and run properly

**Resolution**

To solve this issue, we manually isolated our mirroring service on a different server instance, ensuring the engagement service had enough resources to run properly again.

**Actions for the future**

* (Done) Isolate our mirroring service away from other critical services
* (Done) Analyze the causes of the resource increase on our mirroring service, and implement optimizations to reduce its resource usage
* (Done) Improve alerting in case of network resource issues on server instances

resolved

This incident has been resolved.

monitoring

Please know that a fix is live. You should be able to see notifications on your Website again. We are monitoring this.

identified

We are still on it. We are performing actions to resolve the issue.

investigating

We are continuing to investigate this issue.

investigating

Please know that we are facing an issue: The notification might not appear on your Website. We are working on it.

Report: "P3 - Conversation notifications might not be displayed on some websites"

Last update
resolved

From 11:05am CEST to 2:51pm CEST on the 22nd of August, you may have noticed that no chat notification was displayed on your website. This was caused by a deployment made by our technical team, which introduced a regression. Not all clients were impacted. The issue was addressed as soon as we noticed it. We are sorry for any inconvenience this may have caused.

Report: "P2 - Since the 7th of August 6pm CET, our presence and occupation reports are not updated"

Last update
postmortem

## **Incident**

From the 7th of August 8:04 pm CEST to the 9th of August 9:38 am CEST, the presence reports were no longer updated. We hit a size limit on a field in one data table, which prevented any update. As a result, the state of the agents (connected or disconnected) remained unchanged until the incident was resolved. Unfortunately, we cannot recover the state of the data for the period of the incident.

## **Resolution**

We changed the data type and format of the field in the presence table to lift the size limit and allow for durable storage of presence events. Despite the return to normal on August 9th, we noticed some desynchronization in the agents' states. A manual cleanup of agent presence was required to solve this. Some isolated cases may remain for some agents. If you are facing this situation, please ask your agents to disconnect and reconnect so that presence events are recorded again.

## **Actions for the future**

* (In Progress) Audit the tables that could face the same issue and apply the fix (format change) in anticipation
* (Done) Strengthen probes to be warned in real time when presence events do not flow as expected
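The real-time probe mentioned in the last action can be pictured as a simple staleness check. The sketch below is only an illustration under stated assumptions: the function name `presence_feed_is_stale`, the 15-minute threshold, and the timestamps are all hypothetical and do not describe iAdvize's actual monitoring.

```python
from datetime import datetime, timedelta


def presence_feed_is_stale(
    last_event_at: datetime,
    now: datetime,
    max_silence: timedelta = timedelta(minutes=15),  # hypothetical threshold
) -> bool:
    """Return True when no presence event has arrived within max_silence."""
    return now - last_event_at > max_silence


# Placeholder timestamps, not taken from the incident.
now = datetime(2000, 1, 1, 12, 0)
print(presence_feed_is_stale(datetime(2000, 1, 1, 11, 50), now))  # False
print(presence_feed_is_stale(datetime(2000, 1, 1, 11, 30), now))  # True
```

A check of this kind would typically run on a schedule and raise an alert as soon as it returns True.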

resolved

After a monitoring period, we confirm that the incident has been resolved. The presence and occupation data are once again being recorded. However, during the period of the incident (from Wednesday 7 August 6:03pm CET to Friday 9 August 10:57am CET) the presence and occupation events were not recorded. Therefore, you may notice inconsistencies in the reports related to these indicators. On behalf of iAdvize, we apologise for any inconvenience caused.

monitoring

We confirm that the presence and occupation data are being recorded again. We are now working on resynchronising any user account that requires it. As no presence or occupation events were recorded during the incident, reports linked to this data may display inconsistencies during this period.

identified

We have finished an upgrade that ran all night. We are currently validating that the presence data in the reports is calculated again. We are sorry for the inconvenience caused.

identified

We are continuing to work on resolving this problem. The impact is on statistics only: the agents' presence and occupation events have not been recorded since Aug 7th at 6pm CET. Therefore you may see inconsistencies in the presence and occupation reports, as well as on the presence indicator in the Supervision. However, there is no impact on conversation handling or on conversation volume. We are sorry for the inconvenience caused.

identified

We are still working on a fix. Thanks for your patience.

identified

We are continuing to work on a fix for this issue. Presence and occupation events are still not recorded in the reports.

identified

Since the 7th of August 6pm CET, our presence report has not been updated with any occupation or presence events. Our technical team has identified the root cause and is correcting it. We will keep you informed.

Report: "P1 - Service disruption on Bots"

Last update
postmortem

## **What happened?**

We had a severe load issue on our Bots backend services, preventing Bots from being functional on your websites. Conversations handled exclusively by humans were still functional. However, if a bot intervened in the engagement flow used by visitors, the conversation could not take place.

This load issue was caused by an unscheduled self-cleaning script run by the Bots database engine. The cleaning was performed on a table with a large amount of data. As a consequence, critical queries needed by Bots could not complete in a reasonable time, degrading the whole Bots system.

This issue happened on October 26th between 11:10 and 15:47 CEST.

## **Resolution**

Once started, the self-cleaning script cannot be stopped and must run to completion, so we looked for alternative solutions. In order to mitigate the issue and restore the Bots services, we performed the following actions:

* Upgrade the Bots database engine to a higher-performance instance type so that the Bots' critical queries could be fully executed. This action took several hours to complete, which partly explains the duration of the incident.
* Deploy a new version with patches to reduce the bots' dependence on the overloaded database.

## **Actions for the future**

* (Done) The bots database engine upgrade significantly reduced the permanent load. We have more capacity to handle heavy loads.
* (Done) We have identified and cleaned up the data table at the origin of the self-cleaning.
* (In progress) We are setting up new probes to detect loaded databases and avoid self-cleaning script launches.

resolved

The latest intervention we announced was completed successfully. Since the bot service was restored (3:47pm CEST), our monitoring shows that everything is back to normal. The incident is closed.

monitoring

In order to stabilize the bot service, our technical team will apply a patch around 7:30pm CEST. During this time, you may experience some disturbances. We will do our utmost to minimize any impact on production. Thank you in advance for your understanding.

monitoring

The incident is now resolved. However, some conversations from this period may still appear as ongoing (stuck) in the Production report. Our technical team is doing its best to close them.

monitoring

Our latest actions have restored the bot. The situation is now back to normal, our technical team continues the monitoring of the service. Thanks again for your patience during the resolution of the incident.

identified

Personal canned answers have been restored. Your agents can now use them normally. Our technical team keeps working on new actions to restore the bot service. Thanks.

identified

Our technical team is still working on technical interventions to solve the issue. One action taken involved temporarily suspending our service dedicated to personal canned answers. As a result, your agents may notice that some personal canned answers normally available are currently missing.

identified

Our technical team is still working on several actions to fix the bot service. Thanks again for your understanding and patience.

identified

The cause of the incident has been identified, we are actively working on a fix to correct this issue as soon as possible. Thanks again for your patience.

investigating

We are still investigating and attempting to mitigate the issue on bots. We will update again as we have more information. Thank you for your patience.

investigating

We are investigating a problem concerning bots. Bots may not be able to reply to conversations. Thanks for your understanding.

Report: "P2 - Conversation panel and livechat were unavailable"

Last update
resolved

This morning, from 09:03 to 09:09 CEST, we experienced instabilities on the conversation panel and the livechat. The technical team intervened to stabilize the platform. The situation is back to normal. We are sorry for the inconvenience.

Report: "P1 - Bot scenarios failing"

Last update
resolved

The incident is now closed. Everything is working normally.

monitoring

The issue is solved; we are monitoring.

identified

We are continuing to work to fix this issue.

identified

The issue has been identified and a fix is being implemented.

Report: "P2 - iAdvize services degraded"

Last update
resolved

Our technical team has performed an intervention restoring the stability of all iAdvize services. Everything is running normally; this incident is closed.

identified

After further investigation, we found that this stability issue is impacting all services. It leads to a partial degradation of our services, so users may encounter random errors when using iAdvize.

identified

Our mobile SDK service is experiencing stability issues, leading to random errors. This service is working but is degraded, our technical team is working on it. All the other iAdvize services are operating normally.

Report: "P2 - Delay in statistics update"

Last update
resolved

This incident has been resolved. Thank you for your patience.

monitoring

Following the intervention of our technical team, the delay in the reports update has been resolved since 9:50 (CEST). Statistics are now up to date, thank you for your patience. We are continuing to monitor the situation.

investigating

We are currently experiencing a slowdown on our statistics service. It is impacting the update of all iAdvize reports, our technical team is currently working on solving this issue.

Report: "P0 - Service disruption - Impossible to start new conversations & errors on the conversation panel"

Last update
postmortem

## **Incident**

Following a [major maintenance on our Livechat database](https://status.iadvize.com/incidents/3n39s39s99p7) (DB) to upgrade to a new version, scheduled on Feb. 22nd from 6:30 to 8:00 CET, we encountered CPU scalability problems as traffic on the new instance began to increase with the start of the day in Europe. All active connections to this DB slowed down until they reached a timeout. At that point, the DB was frozen and unreachable. As this DB is central to the Livechat app, we were faced with a generalized interruption in conversation processing across all channels (chat, call, video, whatsapp, facebook, …) supported by the iAdvize platform.

Downtime on conversation processing occurred on Feb. 22nd, between 9:25 and 10:50 CET.

## **Resolution**

As soon as we became aware of this incident, we shut down the services displaying contact notifications, to prevent visitors from trying to start conversations that could not be handled by the system. Afterwards, we had to shut down several services linked to this DB and manually kill backend processes in order to mitigate the problem and decrease CPU load. Once the CPU level was acceptable again, we ran a system-checking script to verify DB integrity and optimize its operation. Finally, we were able to restart all services one by one without risking a new CPU burst.

## **Actions for the future**

* (Done) Identify and set up a throttle system for future DB migrations. The aim is to allow a gradual increase of incoming traffic on new instances, in order to keep control over CPU and memory load.
* (In progress) Improve our internal processes and tools to optimize DB crash resolution time.
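The throttle described in the first action, a gradual increase of incoming traffic on a new instance, can be illustrated with a linear ramp. This is a minimal sketch under stated assumptions, not the mechanism iAdvize implemented: the function names, the 5% starting share, and the bucketing by request id are all hypothetical.

```python
def allowed_fraction(elapsed_s: float, ramp_s: float, start: float = 0.05) -> float:
    """Share of traffic the new instance may receive after elapsed_s seconds.

    Grows linearly from `start` (hypothetical 5%) to 100% over ramp_s seconds.
    """
    if ramp_s <= 0:
        return 1.0
    progress = min(max(elapsed_s / ramp_s, 0.0), 1.0)
    return start + (1.0 - start) * progress


def route_to_new_instance(request_id: int, elapsed_s: float, ramp_s: float) -> bool:
    """Deterministically send a growing share of requests to the new instance."""
    bucket = (request_id % 100) / 100.0  # stable bucket in [0, 1)
    return bucket < allowed_fraction(elapsed_s, ramp_s)
```

With a 600-second ramp, request id 3 is routed to the new instance from the start, while request id 50 only reaches it once roughly half of the ramp has elapsed, so CPU and memory load can be observed before full traffic arrives.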

resolved

During the monitoring performed since the last update, no issue has been identified. As stated earlier, this incident has been resolved since 10:50am CET. We are now closing this incident on the status page.

monitoring

The incident has been over since 10:50am CET. The conversation processing flow is now operational. Thank you for your patience.

monitoring

Following technical interventions, we are seeing significant improvements. iAdvize services have been restarted gradually while we monitor the results. You should notice improvements on your side:

- On the visitor side: notifications are visible on the website and conversations can be started
- For agents: new incoming conversations can arrive on the conversation panel
- For admins & managers: inconsistencies may still be seen on some reports; we are currently working on it

We are monitoring the situation and will communicate again once it is fully back to normal.

investigating

Our technical team is still actively working to mitigate this incident. Several interventions have been performed, but the situation is still not back to normal. We continue to notice impacts on several services:

- On the visitor side: it is impossible to start conversations, and no notification is visible on the website
- For agents: different kinds of impacts, such as errors on the conversation panel when handling open conversations, and no new incoming conversations
- For admins & managers: inconsistencies on some reports (the production report does not show connected agents, the conversation report does not show past conversations)

investigating

We are still investigating and attempting to mitigate the issue. We will update again as we have more information. Thank you for your patience.

investigating

We are currently experiencing service disruptions, so you may notice issues on several services:

- It is impossible to start new conversations
- Errors on the conversation panel for open conversations
- Inconsistencies on some reports (production report)

Our technical team is currently working on this issue.

Report: "P0 - Access disruption - Conversation panel not available"

Last update
resolved

This incident has been resolved.

monitoring

The cause of this issue has been identified, and everything is now back to normal. iAdvize notifications are displayed on the website and visitors can initiate conversations again. We will keep monitoring the situation for the coming hours.

investigating

The cause of this issue has been identified. An intervention has been performed and the iAdvize conversation panel is now accessible again. However, we still notice errors preventing the display of iAdvize notifications on your websites. Our technical team is working on it.

investigating

The iAdvize conversation panel is currently not available. We are working on the problem to get it resolved as soon as possible.

Report: "P2 - Bot conversations with Copilot leading to technical error responses"

Last update
resolved

This issue has been resolved completely, and the production of Copilot-powered bots is back to normal.

monitoring

A fix has been applied and conversations are no longer systematically experiencing technical errors. We continue to monitor the situation.

investigating

Currently, conversations with Copilot result in technical errors. Our team is working to fix this issue.

Report: "P2 - Disturbances on the conversation panel"

Last update
resolved

From 12:04pm to 12:44pm, you may have experienced some issues on the conversation panel:

- Ongoing messages disappear
- Ongoing messages do not send
- Conversations cannot be closed

Our technical team has identified the issue and fixed it. You will probably need to refresh the conversation panel to restore it. Thanks for your understanding.

Report: "P2 - Update delay on sales & presence statistics"

Last update
resolved

We can now confirm that the statistics are up-to-date. Thank you.

identified

Following an intervention by our technical team, the data of our Sales and Presence reports is now gradually updating. However due to the large volume of data to process, it will take several hours before the data is 100% up to date. We will publish another message when all is back to normal, thank you for your patience.

investigating

Since Sunday Feb 11th, we have been experiencing a long update delay on the statistics of the Sales and Presence reports. As a consequence, the data in these two reports is currently not up to date. Our technical team is working on this issue.

Report: "P1 - Impossible to receive new Facebook Messenger conversations"

Last update
resolved

All past Facebook messages have been processed, new messages will immediately be forwarded to iAdvize.

monitoring

Our technical team has identified the cause of this issue and a correction has been made. All Facebook messages sent by your clients since this morning will gradually be forwarded to iAdvize. Since there are many messages to process, we expect this step to take around 30 minutes to 1 hour. After that, new messages will be immediately forwarded to iAdvize as usual.

investigating

Since this morning around 8:30am (CET), it has been impossible to receive new Facebook Messenger conversations through iAdvize. Our technical team is currently working on it. Other channels are not impacted.

Report: "P1 - Service disruption - iAdvize notifications not displayed on website"

Last update
resolved

Following a period of monitoring, we can confirm that the platform's situation is stable.

monitoring

We have fixed the issue with a patch and your website notifications are now functioning properly again.

investigating

There is an issue we are currently investigating with notifications, such as chat buttons, invitations, and messages, not appearing on your website. This problem is resulting in a decrease in the number of incoming contacts.

Report: "P1. Bots are no longer operational (they no longer respond)"

Last update
postmortem

## **Incident**

We had a lag issue on our Bots backend services, preventing Bots managed by iAdvize from being functional on your websites. Conversations handled exclusively by humans were still functional. However, if a bot intervened in the engagement flow used by visitors, conversations stopped at the first stage of the bot scenario.

This lag issue occurred following the release of a version containing the first building blocks of a feature that will soon be available. This release successfully passed all our validation protocols. However, the increase in load that followed the release generated a significant lag in the incoming conversation ingestion service. As a consequence, these incoming conversations exceeded the maximum execution time and were discarded from processing.

This issue happened twice on November 29th:

* 16:45 to 17:22 CEST
* 17:27 to 17:37 CEST

## **Resolution**

In order to mitigate the issue and restore the Bots services, we performed the following actions:

* Manually clean up the event overflow in the incoming conversation ingestion service
* Roll back the release that introduced the lag

## **Actions for the future**

* (Done) Add parallel processing on the event consumers in order to react quickly in case of lag on bots
* (Done) Put a limit on the event publisher in order to prevent possible lag on bots
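The two actions above (parallel consumers and a limit on the publisher) correspond to a common pattern: a bounded queue drained by several workers. The sketch below is a generic illustration using Python's standard library, not iAdvize's implementation; `process_events`, the worker count, and the queue size are hypothetical, and the uppercase transformation merely stands in for real event handling.

```python
import queue
import threading


def process_events(events, workers=4, max_pending=100):
    """Process events with parallel consumers behind a bounded queue."""
    # The bounded queue limits the publisher: put() blocks while the queue is full.
    pending = queue.Queue(maxsize=max_pending)
    results = []
    lock = threading.Lock()

    def consume():
        while True:
            event = pending.get()
            if event is None:  # sentinel: no more work for this consumer
                return
            with lock:
                results.append(event.upper())  # placeholder for real handling

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for t in threads:
        t.start()
    for event in events:
        pending.put(event)  # back-pressure applies here
    for _ in threads:
        pending.put(None)  # one sentinel per consumer
    for t in threads:
        t.join()
    return results
```

Because the consumers run in parallel, the order of the results is not guaranteed; callers that need ordering would have to sort or tag the events.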

resolved

This incident has been resolved. All conversations stuck during the incident have been manually closed. Thank you for your patience.

monitoring

Some conversations that started during the incident are still in progress, but the bot is no longer responding in them. We are going to close these conversations manually.

monitoring

We have seen a return to normality in the last 5 minutes following our latest actions. We are continuing to monitor the situation.

investigating

We are seeing new disruptions appear. Bots are taking a long time to respond or are no longer responding in some cases. We are actively working to resolve the problem.

monitoring

The service has restarted and we are seeing a return to normal.

investigating

We have noticed that the bots are no longer responding. The technical team is working to resolve the problem. We're going to restart the service to restore it.

Report: "P3 - Agents were not able to snooze conversations"

Last update
resolved

From Nov 29 around 6:00PM CET to Nov 30 9:37AM CET, agents may have had difficulty snoozing their conversations: the snooze modal did not remain open. This was due to the "Product tour" we activated to introduce the new auto-closure feature for snoozed conversations. The "Product tour" has been deactivated and the situation is back to normal (the conversation panel page may need to be refreshed). We are sorry for the inconvenience caused.

Report: "P3 - Long response time on Bots"

Last update
resolved

Today between 4:03pm and 5:00pm (CET), we experienced slowdowns on our bot service. As a result, bots took longer to respond, so visitors may have waited longer than usual before receiving a reply in their conversations with bots.

Report: "P1 - Slowdowns on our platform mainly on the desk app and statistic reports"

Last update
resolved

Between 10:10 and 10:26 am (CEST) this morning, we experienced slowdowns on the platform, with difficulties in displaying the live activity report and the desk. The issue has been resolved since 10:26 am. Thank you for your understanding.

Report: "P2 - Incident Meta: It is no longer possible to send replies on the Messenger channel."

Last update
resolved

Following an incident at Meta, it was no longer possible to send messages from the Meta Messenger API between 15 November at 9pm CET and 16 November at 4am CET.

Report: "P2 - Chatbox did not appear after clicking on an engagement notification"

Last update
resolved

Between 3:43pm and 3:48pm (UTC+2), when a visitor clicked on an engagement notification, the chatbox did not always open. The problem has since been resolved.

Report: "P0 - Connection disturbance on the platform"

Last update
resolved

Our monitoring indicates that everything returned to normal after our technicians' last intervention. The incident is now closed.

monitoring

Our platform experienced major connection outages for users from 07:34 p.m. to 07:44 p.m. CEST. The technical team has stabilized the situation and is continuing to monitor it.

Report: "P1 - Visitor engagement service (chat/call/video) no longer operational"

Last update
resolved

After this monitoring period, we confirm that the situation has returned to normal. Thank you for your patience and understanding.

monitoring

We have confirmed that things have returned to normal in the last few minutes, and are monitoring activity on the platform.

identified

We are continuing our actions and seeing improvements. Engagement notifications are appearing again, but we have not yet returned to pre-incident volumes.

investigating

We have identified the source of the problem and are working to resolve it. The administration engagement section is also currently unavailable. Thank you for your patience.

investigating

We're seeing a high number of errors in the visitor engagement service (triggering engagement rules and displaying notifications). We are working on resolving the problem.

Report: "P2 - Delay to display statistics"

Last update
resolved

This incident has been resolved.

monitoring

We have just launched an action to stabilize the statistics service. We will continue to monitor for any further problems.

monitoring

You may see stats disappear again. We are working on the issue. Please be assured that:

- Conversations will remain unaffected as production will continue to run smoothly.
- The only service temporarily disrupted will be access to statistics.
- There will be no risk of data loss; once our next fix is implemented, we will recover all data.

monitoring

Stats are back now. Please note that none of them are missing. We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

An incident is currently ongoing on our statistics service. You won't be seeing updated statistics for now. We are currently working on the restoration of the service.

Report: "P1 - Difficulty to load the desk"

Last update
postmortem

## **Incident**

We encountered instabilities in the loading of the discussion panel and the mobile app for agents. On desktop, a blank screen was displayed to agents instead of the expected interface. In the mobile app, an error was displayed as soon as agents entered their credentials. These regressions prevented agents from processing the incoming flow of conversations.

These instabilities occurred:

- August 11th: 9:35 > 10:42 CEST

## **Reasons**

We are currently working on some major innovations that will be live in a few weeks' time. The first pieces of this in-depth work are starting to be delivered in production. Although inactive for our end users, they impact the source code already in production. The incident came from one of these supposedly transparent deliveries. It introduced a new environment variable required to start the discussion panel and mobile app.

On our build and test environments, no problems were detected, as this variable is instantiated automatically on the discussion panel and mobile app login. On production servers, because not all the new pieces are online yet, the variable was not instantiated as expected. This generated an internal error.

## **Resolution**

We solved the problem by taking several successive actions:

- Our probes and automated non-regression tests detected the incident within minutes of deployment to production. We therefore proceeded to a rollback of the release. This action did not restore the service, as a database schema update accompanied the release.
- We had to urgently develop a hotfix to bypass the database schema update.

## **Actions for the future**

- \(Done\) \(Tech\) Add new unit tests on build and test environments in order to check the integrity of newly added environment variables
- \(Done\) \(Tech\) Remove dependencies between in-progress developments that are not yet visible and the initial loading of the discussion panel and mobile apps
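The kind of environment variable integrity check mentioned in the actions above could look like the following minimal sketch. It is an illustration only: the variable name `DESK_FEATURE_ENDPOINT` and both function names are hypothetical, since the postmortem does not name the variable or describe how the check is implemented.

```python
import os

# Hypothetical variable name; the postmortem does not state the real one.
REQUIRED_VARS = ("DESK_FEATURE_ENDPOINT",)


def missing_env_vars(required, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


def check_environment(required, environ=None):
    """Fail fast at startup with an explicit message instead of a late internal error."""
    missing = missing_env_vars(required, environ)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


# Example: a production-like environment where the variable was never set.
production_like = {"PATH": "/usr/bin"}
problems = missing_env_vars(REQUIRED_VARS, production_like)
```

Running the same check in build, test, and production environments with the same list of required names is what makes a variable that is only set implicitly in some environments visible before release.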

resolved

This incident has been resolved.

monitoring

A fix has just been implemented and we confirm that the problem has been resolved. We invite affected users to refresh their browser to reload the desk correctly. We are monitoring activity.

identified

We have identified the source of the problem and a patch is currently being deployed.

investigating

We're seeing instability when connecting to the desk. The problem appears to be intermittent and does not affect all users. We are currently investigating.

Report: "P1 - Some bots not able to handle conversations"

Last update
postmortem

**1 - What happened?**

Over the past few days, we've encountered several instabilities in the processing of conversations by iAdvize bots. Instead of unfolding the expected scenario, the bots failed to process visitors' messages. As a result, the bots were displaying error messages or no response at all, resulting in a severe degradation of the user experience on your websites.

These instabilities occurred:

* August 11th: 9:57 > 11:37 CEST
* August 15th: 4:45 > 7:54 CEST

**2 - What caused the outage?**

We are currently working on a major revamp of the technical core of iAdvize bots. The aim of this redesign is to improve the speed of execution of bots' scenarios and make them easier to maintain, especially during future technical updates. This work includes the implementation of a new service dedicated to the reception of new conversations by bots. These conversations are then distributed to a second service, which executes the scenario defined in the iAdvize administration.

Instabilities occurred on this new service due to bot message parsing: messages in an unknown format were pushed to the system. This resulted in a delay in the processing of new bot conversations. We identified this on two occasions:

* On August 11th, the delay was small, and only a third of the bots managed by iAdvize were affected.
* On August 15th, despite the patches applied after the previous instability, we experienced a new problem. The accumulated delay was enough to interrupt all bot conversation processing.

**3 - What was the fix?**

* On August 11th, we mitigated the problem by manually identifying and restarting the instances of the bot conversation reception service that were causing problems.
* On August 15th, our new probes detected a new delay. We also restarted faulty instances. However, these actions didn't work as expected. A new problem with bot message parsing was discovered. We had to urgently develop a hotfix to unblock a message type that was not recognized by the system.

**4 - How will iAdvize prevent this issue in the future?**

* \(In progress\) \(Tech\) Technical audit of the new bot service in order to identify any new potential failure points
* \(Done\) \(Tech\) Improve our probes to detect potential delays in the processing of new conversations by bots
* \(Done\) \(Tech\) Improve bot service reliability by adding safeguards to prevent bots from blocking the processing of new conversations in the case of unsupported message types
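One common way to build the kind of safeguard listed above is to set unparseable or unsupported messages aside instead of letting them block the rest of the queue. The sketch below illustrates that general pattern and is not iAdvize's actual code: the message types in `KNOWN_TYPES`, the JSON message shape, and the `process_batch` function are all assumptions made for the example.

```python
import json

# Hypothetical set of supported message types.
KNOWN_TYPES = {"text", "quick_reply"}


def process_batch(raw_messages):
    """Parse each raw message; set aside the ones that cannot be handled.

    Returns (handled, rejected). Rejected entries keep the raw payload and
    the reason, so they can be inspected later without blocking the queue.
    """
    handled, rejected = [], []
    for raw in raw_messages:
        try:
            message = json.loads(raw)  # raises a ValueError subclass on bad JSON
            if not isinstance(message, dict) or message.get("type") not in KNOWN_TYPES:
                raise ValueError("unsupported message type")
        except ValueError as error:
            rejected.append((raw, str(error)))
            continue  # keep going: one bad message must not delay the others
        handled.append(message)
    return handled, rejected


batch = [
    '{"type": "text", "body": "hi"}',
    '{"type": "carousel"}',   # well-formed, but of a type the service does not know
    "not json",               # malformed payload
    '{"type": "quick_reply"}',
]
handled, rejected = process_batch(batch)
```

Here the two valid messages are still handled even though two problematic messages sit between them in the batch.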

resolved

After a period of monitoring, we confirm that the incident has been resolved. Thank you for your patience and understanding.

monitoring

The situation is now back to normal. Our technical team is continuing to monitor our infrastructure.

investigating

We are currently investigating an issue on our bot service. Most of the bots are not able to handle conversations. Our technical team is working on solving this problem.

Report: "P2 - Problem initiating conversations with bots."

Last update
resolved

This morning between 10:00 and 11:30 CEST, we saw errors when bots were taking over conversations. In some cases, visitors received an unavailability message after initiating a conversation with a bot. The problem was not systematic (1/3 of bot conversations were affected) and was identified and resolved by our teams. As of 11:30 CEST, the service is back to normal.

Report: "P2 - WhatsApp : Problems sending and receiving messages"

Last update
resolved

After several days of monitoring the platform, we have noticed no more errors. An incident is still open on Meta's status page, but the deployed hotfix is working. We have therefore decided to close this incident.

identified

Since August 2, we've been experiencing problems sending and receiving messages on the WhatsApp channel. As a result, you may notice a drop in the volume of incoming conversations, as well as message sending errors within the desk of connected agents.

WhatsApp has reported several incidents on its own status page (On-Premises Solution section), which we invite you to follow here to keep up to date with developments on their side: https://metastatus.com/whatsapp-business-api

We are doing our utmost to mitigate the impact and invite you to contact the support team at help@iadvize.com if you experience any of the problems described above. We thank you in advance for your patience and understanding.

Report: "P3 - Errors on the iAdvize conversation panel & iAdvize conversation API"

Last update
resolved

Between 1:53pm and 2:06pm (UTC+2), we experienced disruptions on our conversation service. As a consequence, agents may have encountered errors on their conversation panel while handling conversations from any channel (receiving / transferring / closing). During this time, you may also have noticed errors while using our conversation API. Everything has been back to normal since 2:06pm.

Report: "P2 - Some bots not able to handle conversations"

Last update
resolved

After a period of monitoring, we confirm the situation has been back to normal since 9:38am (UTC+2).

monitoring

A technical intervention has been performed. The situation is back to normal and bots can now handle new conversations.

investigating

We are currently investigating an issue on our bot service. Most of the bots are not able to handle conversations. Our technical team is working on solving this problem.

Report: "P2 - Sales & Presence Statistics update delay"

Last update
resolved

Sales & Presence data has been up to date since 0:00 (UTC). Everything is back to normal.

monitoring

Data update is still slowly catching up. It will still take several hours before the Sales & Presence reports are up to date.

monitoring

After a technical intervention, data is now gradually being updated for Sales & Presence reports. Due to the large amount of data, it may take some time before these reports are up to date.

identified

We have detected a delay in the update of the Sales & Presence statistics. For these 2 reports:

- Data for May 7th is not fully updated.
- Data for May 8th is not updated at all.

No information is lost; it is only an update delay on the reports. Our technical team is working on it.

Report: "P2 - Delay in the indexation of statistics"

Last update
resolved

After a monitoring period, we confirm the resolution of the incident and thank you again for your patience.

monitoring

Our technical team has identified the cause of the problem and has performed an intervention. All reports are now updated and no delay is noted on new statistics updates. We will keep monitoring the situation for the coming hours.

investigating

Our technical team is still mobilized to solve this statistics issue, impacting the update of closed conversations in the iAdvize reports.

investigating

We are currently experiencing a delay in the indexing of statistics, which has an impact on the recording and updating of closed conversations. Our technical team is currently investigating the problem. Thank you for your patience.

Report: "P2 - Conversation distribution issue - Conversations stuck in pending mode"

Last update
resolved

This incident has been resolved. Thank you for your patience.

monitoring

The situation has been back to normal since 9:43 (UTC+2). All pending chats are being distributed as usual. Our technical team will keep monitoring the situation to make sure everything is stable.

identified

The cause of the issue has been identified. An intervention has been performed and we are noticing improvements on the chat distribution. Pending chats are gradually being distributed and the situation is slowly coming back to normal.

investigating

We are currently experiencing an issue on the conversation distribution. Conversations in pending are sometimes not being distributed to agents. Our technical team is currently investigating this issue.

Report: "P2 - Mobile application - Video not possible for agents connected on the iAdvize mobile app"

Last update
resolved

The fix is live and the problem is resolved. Thank you.

identified

We have identified the origin of the functional regression. Our technical team is working on a fix. We expect a resolution within the day. Thank you for your patience.

investigating

The video channel is currently not working for agents connected to the iAdvize application (iOS & Android). Our technical team is currently investigating the cause of this problem. There is no issue for agents connected to iAdvize from a computer.

Report: "P1 - Conversation distribution stopped"

Last update
resolved

Impacts: During the incident, new conversations could not be routed. Connected agents could see an error in their iAdvize desk.

1 - What caused the outage?

We noticed a progressive accumulation of deadlocks on the reading and writing of the database that manages the distribution of conversations. The iAdvize platform started to become unresponsive to conversation creation and conversation distribution. Our probes detected this anomaly but, due to the low volume of conversations at that time, on-call teams were not alerted. This issue happened on March 26th between 05:03 and 06:57 (CET).

2 - What was the fix?

The distribution of conversations to agents was restored following an automatic restart of our routing service. The accumulation of locks on the database finally reached a threshold that triggered this restart.

3 - How will iAdvize prevent this issue in the future?

- (Done) (Tech) Fix the bug generating deadlocks on the distribution database
- (Done) (Tech) Adjust our probes and alerting in order to warn on-call teams even on low volume
- (Done) (Tech) Refine our health checks to be proactive with restarts when reaching database congestion
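The alerting and health check adjustments listed above can be pictured with a small decision function. This is a hedged sketch, not the actual probe logic: the function name `decide_actions`, the action labels, and the threshold values are hypothetical. The point it illustrates is that the decision depends only on database lock counts, not on conversation volume, and that a steady rise triggers an alert well before the restart threshold.

```python
def decide_actions(lock_samples, alert_at=10, restart_at=50):
    """Decide what to do from recent counts of blocked database locks.

    `lock_samples` is a list of lock counts ordered from oldest to newest.
    Conversation volume is deliberately not an input, so low traffic cannot
    suppress the alert. Thresholds are illustrative values only.
    """
    if not lock_samples:
        return []
    latest = lock_samples[-1]
    recent = lock_samples[-3:]
    # "Progressive accumulation": the last three samples strictly increase.
    rising = len(recent) == 3 and recent[0] < recent[1] < recent[2]
    actions = []
    if latest >= alert_at or rising:
        actions.append("page_on_call")
    if latest >= restart_at:
        actions.append("restart_routing_service")
    return actions


quiet = decide_actions([0, 0, 1])         # no trend, below thresholds
building = decide_actions([2, 4, 7])      # still low, but steadily rising
congested = decide_actions([20, 40, 60])  # past the restart threshold
```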

Report: "P2 - Problem of calculating the composition of routing groups impacting the distribution of conversations"

Last update
resolved

The problem is now solved and the impacted routing groups are up to date. We are closing the incident and thank you again for your patience.

identified

A regression was introduced this morning on a component used to calculate the composition of routing groups (that is, to determine which agents should be attached to a routing group). Routing groups modified between 9am (CET) and 12pm (CET) today may be affected by this problem. As a result, these groups may contain fewer users than they should, and therefore some users may no longer receive conversations. We are currently working to resolve this issue by recalculating the composition of all affected distribution groups. We will get back to you as soon as the problem is resolved and thank you in advance for your patience.