miLibris

Is miLibris Down Right Now? Check whether there is an ongoing outage.

miLibris is currently Operational

Last checked from miLibris's official status page

Historical record of incidents for miLibris

Report: "Incident on API V4"

Last update
resolved

The service was restored at 10:15 AM.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We’re currently experiencing an issue with our API. Our technical team has been engaged since 9:00 AM and is actively working on a fix. Thank you for your understanding.

Report: "Incident on API V4"

Last update
Resolved

The service has been restored at 10.15 am

Update

We are continuing to investigate this issue.

Update

We are continuing to investigate this issue.

Investigating

We’re currently experiencing an issue with our API. Our technical team has been working on it since 9:00 AM and is actively working on a fix. Thank you for your understanding.

Report: "API issue"

Last update
resolved

The service has now been restored. Our technical team is continuing to monitor the situation to ensure stability.

investigating

We’re currently experiencing an issue with our API. Our technical team has been engaged since 8:00 AM and is actively working on a fix. Thank you for your understanding.

Report: "Ongoing Incident – API"

Last update
resolved

This incident has been resolved.

monitoring

The service has now been restored. Our technical team is continuing to monitor the situation to ensure stability. Thank you for your patience and understanding.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We’re currently experiencing an issue with our API. Our technical team has been engaged since 8:00 AM and is actively working on a fix. Thank you for your understanding.

Report: "Current incident on API V4"

Last update
resolved

The incident is closed. The miLibris push service will remain down over the weekend, pending maintenance early next week. We apologize for this morning's disruption, which lasted a few dozen minutes.

monitoring

The service has been back to normal since 10 a.m., but our technical team remains on call to make sure the recovery holds.

investigating

The service is experiencing instabilities. Our technical team is currently working to isolate the problem and restore service.

Report: "Incident on API V4"

Last update
resolved

This incident has been resolved.

identified

The situation has just returned to normal, and our teams are continuing to monitor developments.

identified

Our technical team is working to restore service as quickly as possible.

Report: "Ralentissements de service sur les systèmes de la publication"

Last update
resolved

We are pleased to inform you that the situation has been successfully resolved. All titles have been published, and our teams have confirmed that the issue has been fully addressed. We would like to express our gratitude for your patience and understanding throughout this incident. Rest assured, measures have been implemented to prevent any recurrence of such issues in the future. We are continuing to monitor for any further issues. Thank you once again for your continued support.

monitoring

We are continuing to monitor for any further issues.

monitoring

The situation is gradually stabilizing, and most titles are either published or in the process of being published. Our teams remain vigilant and continue to closely monitor the situation. We sincerely apologize for any inconvenience caused and assure you that measures are being developed to prevent this type of issue from recurring. Thank you for your understanding.

identified

We are currently experiencing slowdowns in our content ingestion system. We identified the source of the disruption earlier and are gradually restoring the service. However, it may still take a few hours for all ongoing publications to be processed and published. We thank you for your patience and sincerely apologize for any inconvenience this may cause. We will keep you informed as soon as our service performance is fully restored.

identified

We have identified the source of the service slowdown and are currently validating the resumption of normal activities on our servers. Publication services should gradually resume over the next few minutes.

investigating

We are currently experiencing slowdowns in processing and ingesting content. Our technical teams are mobilized and actively working to resolve this situation as quickly as possible. We thank you for your patience and apologize for any inconvenience caused. We will keep you informed as soon as the performance of our service is restored.

Report: "Intermittences de service"

Last update
resolved

A new patch has been implemented by the team, and the platform is regaining stability. Our team will remain on standby for the next few hours to ensure service continuity.

monitoring

We are experiencing intermittent failures on calls to our API, which is causing 500 errors on our kiosks. Our teams are working to stabilize the situation and restore the service.

Report: "core apiv4 partial outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

Some API routes are failing; we are investigating the root cause of the problem. The service is partially impacted.

Report: "major outage on content availability following an incident on our storage system"

Last update
resolved

Everything is now up and running. Our core storage array started to malfunction around 4:00 AM this morning, making our data partially unavailable to the rest of our services. The root cause appears to be a hardware bug on the controller managing the array, and we are starting an in-depth analysis to determine the next corrective actions.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "major network outage"

Last update
resolved

This incident has been resolved.

monitoring

All services have been restored; we are continuing to monitor the situation to ensure that everything is working properly.

investigating

Our provider is experiencing a major network outage; all miLibris services are currently affected.

Report: "Services affected by an outage"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

All services are back online and we are monitoring the situation.

investigating

We are currently experiencing an outage on several services; we are working on solving this issue and will keep you updated. Affected services are: console management, FTP uploads, and the support website.

Report: "Major outage between 2:30 AM and 3:20 AM"

Last update
resolved

Our core database suffered a 50-minute network outage last night, between 2:30 AM and 3:20 AM. The incident has been resolved, but we are still investigating the disruption to our services.

Report: "Documentation and FTP stats services down"

Last update
resolved

This incident has been resolved.

monitoring

The documentation website is back up and the FTP service for stats exports is back online. We have regenerated the exports from 09/03.

identified

At 4 AM last night, our hosting provider lost one of its datacenters to a fire. All backup components are up and running except the following services:
* documentation website
* FTP stats exports
We are actively working to bring those services back online as soon as possible. Thank you for your understanding.

Report: "core api shortage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are experiencing a shortage on our core API; we are investigating the root cause of this issue.

Report: "network outage"

Last update
resolved

This incident has been resolved.

monitoring

The root cause of the outage is faulty network infrastructure in our hosting provider's datacenters. Our services are coming back online and we are monitoring our servers to ensure all services are properly restored.

investigating

We are continuing to investigate this issue.

investigating

We are experiencing a network outage affecting most of our services.

Report: "core api shortage"

Last update
resolved

This morning our core API experienced a major shortage between 05:45 and 08:20 CET. Multiple external services that we depend on for specific API requests suffered a network outage, causing a denial of service. We are currently working on several updates to ensure proper isolation and improve our overall robustness against such problems. Thank you for your understanding.

Report: "core api degraded performance"

Last update
resolved

The networking problems have been resolved and the performance is back to normal.

investigating

Our core database cluster is affected by a networking problem. We are currently investigating the issue with our hosting provider. All systems are operational but response time on our core API is degraded. We are sorry for the inconvenience and will keep you informed about any further developments.

Report: "Partial outage of api and publication"

Last update
resolved

This incident has been resolved.

monitoring

We have isolated the source of the problem, but a high load is to be expected this morning. Also, publications scheduled between 02:30 and 02:50 and between 06:00 and 06:45 AM ETC might have encountered a failure.

identified

We are experiencing a problem with an external service affecting the core api.

Report: "Access to our services affected by maintenance."

Last update
resolved

This incident has been resolved.

monitoring

We believe that we have fixed most of the issues, but some services might still work with degraded performance.

monitoring

A network outage has caused several issues on our core components. We are working on fixing them.

monitoring

Our network provider is performing updates on their systems, which is likely to affect access to our service in the coming hours. We expect the service to return to normal at 4 a.m. (UTC). We apologize for any inconvenience caused and thank you for your patience.

Report: "API core servers experienced a major slowdown"

Last update
resolved

A malfunction in an external service caused our API service to become saturated.

Report: "Outage on content delivery."

Last update
resolved

We suffered a network outage this morning affecting several of our CDN servers. The problem is now resolved.

Report: "Heavy load on CDN"

Last update
postmortem

On Saturday morning we had a two-hour service disruption due to an overload of our CDN: we exceeded our bandwidth limits. After fixing this, it was time to understand what really happened. Morning spikes are part of our business: we monitor our bandwidth and have a very large allowance for it.

### Problem origin

The problem was in fact in our processing chain. We use a pool of 20 dedicated servers to convert client inputs and package the final content before sending it through our delivery infrastructure. All 20 servers are synced with a central configuration manager. We noticed that some configuration files were failing to sync, in particular the configuration of the PDF / assets size-optimization process: some of our servers were not able to optimize file sizes. A growing number of releases were sent to our delivery infrastructure without any optimization. Some of them were larger than 500 MB, drastically increasing the amount of bandwidth needed. This has now been fixed and we are back to a normal situation.
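One lightweight safeguard against this failure mode is to check package sizes before a release enters the delivery infrastructure, so an unoptimized build is blocked instead of consuming CDN bandwidth. The sketch below is illustrative only; the threshold, paths, and function names are assumptions, not part of miLibris's actual pipeline.

```python
import os
import sys

# Hypothetical threshold: releases above this size are assumed to have
# skipped the PDF/assets optimization step (value is illustrative).
MAX_RELEASE_SIZE_MB = 200

def release_size_mb(release_dir: str) -> float:
    """Total size of all files under a packaged release directory, in MB."""
    total = 0
    for root, _dirs, files in os.walk(release_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / (1024 * 1024)

def check_release(release_dir: str) -> bool:
    """Return True if the release looks optimized, False if it should be blocked."""
    size = release_size_mb(release_dir)
    if size > MAX_RELEASE_SIZE_MB:
        print(f"BLOCK {release_dir}: {size:.0f} MB exceeds {MAX_RELEASE_SIZE_MB} MB "
              "- optimization step probably did not run")
        return False
    return True

if __name__ == "__main__":
    # Usage: python check_release.py /path/to/release
    sys.exit(0 if check_release(sys.argv[1]) else 1)
```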

resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "API Response time"

Last update
postmortem

On Sunday and Monday morning we had downtimes due to the poor response time of our main database. We want to share here, with full transparency, what happened.

### Problem

Our main database servers were not able to execute the number of transactions per time frame that the application servers actually required. We observed high resource utilization (CPU, memory, ...) and excessive locking of database requests. At the time of writing, the root cause is not fully established and we are still investigating. A dedicated team is on site at 6:30 AM every day until complete resolution.

### Impacted services

- High API response times (may cause timeouts on connected devices or apps)
- Content delivery
- Non-cached thumbnail delivery
- In some cases, misbehavior of the publication process (wrong catalog cache invalidation when new content is ready)

### Resolution

We first wanted to minimize the number of DB queries by increasing the query cache of our APIs, and spawned two more instances of our frontend servers for that. We then decided to also change our load balancer configuration to queue and buffer more queries depending on the load of our frontends. These two actions solved the problem and the DB locks finally fell back to a normal value. Time to repair: 4-5 hours for each downtime.

### Ongoing actions

Traffic is constantly growing every day due to the global increase in digital reading, but not at any unexpected level, so this problem is not due to a special load. We still have to work on this problem to understand exactly what happened. Plan of action:

- reduce the coupling between some parts of our services and our main DB
- increase caches or add more application-level caching
- increase the number of DB probes to graph more values
- work on our connection spooler to better spread and fall back connections
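To illustrate the first mitigation (serving repeated queries from an application-level cache instead of the database), here is a minimal sketch of a time-to-live cache decorator. The decorator name, TTL value, and the example query are assumptions for this sketch, not miLibris code.

```python
import time
import threading
from functools import wraps

def ttl_cache(ttl_seconds: float = 30.0):
    """Cache a function's results for ttl_seconds, so repeated identical
    queries are served from memory instead of hitting the database."""
    def decorator(fn):
        store = {}          # key -> (expiry_timestamp, value)
        lock = threading.Lock()

        @wraps(fn)
        def wrapper(*args):
            key = args
            now = time.monotonic()
            with lock:
                hit = store.get(key)
                if hit is not None and hit[0] > now:
                    return hit[1]
            value = fn(*args)           # cache miss: run the real query
            with lock:
                store[key] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator

# Hypothetical usage: a read-heavy catalog lookup wrapped with a 30 s cache.
@ttl_cache(ttl_seconds=30.0)
def get_catalog(point_of_sale_id: int):
    # placeholder for the real database query
    return {"pos": point_of_sale_id, "titles": []}
```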

resolved

This incident has been resolved.

monitoring

Response time is stable. Continuing monitoring.

monitoring

Fix deployed. Waiting for results.

investigating

We are currently investigating this issue.

Report: "CDN partial outage"

Last update
resolved

This incident is now resolved. The root cause was a limited network outage that affected several key components of our CDN servers.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "api shortage"

Last update
resolved

A faulty switch disrupted connections between our api server and our main message broker.

investigating

We are experiencing a shortage on the core API; we are investigating.

Report: "Content delivery Issue"

Last update
resolved

We resolved a problem on our CDN stack occurring under very high load.

investigating

We are seeing significant problems with content delivery in FR. Currently investigating.

Report: "API response time slowdown"

Last update
resolved

API core servers experienced a major slowdown when an external service failed. The incident occurred between 8 and 9 AM GMT+1.

Report: "Our hoster is experiencing networking issues."

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Critical service failure affecting stats and publication."

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Major outage"

Last update
resolved

Our hosting provider resolved the networking outage. Critical services are up and we are monitoring the situation for unexpected problems that might occur.

identified

Our hosting providers are experiencing a major network/routing outage in multiple datacenters.

Report: "api core shortage"

Last update
resolved

We allocated dedicated resources to the functionalities that require external services, thus isolating them from the rest of the API core. We believe this will definitively solve the issue.
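As an illustration of this kind of isolation, the sketch below runs external-service calls in their own small worker pool with a hard timeout, so a slow dependency cannot monopolize the workers serving the rest of the API. Pool size, timeout, and function names are assumptions for this sketch, not the actual miLibris implementation.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Illustrative bulkhead: external-service calls get a deliberately small,
# separate pool and a hard timeout (values are assumptions).
EXTERNAL_POOL = ThreadPoolExecutor(max_workers=4)
EXTERNAL_TIMEOUT_S = 2.0

class ExternalServiceUnavailable(Exception):
    pass

def call_external(fn, *args, **kwargs):
    """Run an external-service call inside the isolated pool with a timeout."""
    future = EXTERNAL_POOL.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=EXTERNAL_TIMEOUT_S)
    except TimeoutError:
        future.cancel()
        raise ExternalServiceUnavailable("external service too slow, request shed")

# Hypothetical usage inside an API handler: if the external dependency is
# slow, the request fails fast instead of tying up a core worker.
def handler(fetch_entitlements, user_id):
    try:
        return call_external(fetch_entitlements, user_id)
    except ExternalServiceUnavailable:
        return {"error": "temporarily unavailable"}
```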

monitoring

We have identified at least two external services with extremely slow response times. Requests requiring these services monopolized all the resources and caused a massive shortage of the API core services. We deployed a fix to isolate the faulty external services and are monitoring the situation.

investigating

We are currently investigating this issue.

Report: "hosting incident affecting kiosks servers."

Last update
resolved

This incident has been resolved.

identified

Our hosting contractor is experiencing network issues and our kiosk servers have lost internet connectivity.

Report: "Faulty CDN"

Last update
resolved

We experienced significant packet loss on critical network links between some of our CDN servers. Traffic has been rerouted and connectivity is nominal again.

investigating

We have isolated a faulty CDN experiencing network connectivity loss.

Report: "Partial delivery of content - Downloads"

Last update
resolved

We corrected a problem on our main delivery cache server.

identified

The issue has been identified and a fix is being implemented.

Report: "Major network problem with our hosting contractor"

Last update
resolved

This incident has been resolved.

identified

One of our load balancers is still experiencing connectivity issues. We are working on it.

monitoring

Connectivity to the server has been restored and services are in a cold-start state.

investigating

We are currently investigating this issue.

Report: "core api degraded performance"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "core api overload"

Last update
resolved

This incident has been resolved.

investigating

Degraded performance; we are investigating.

Report: "API response time slowdown"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

Heavy load on several critical components.

Report: "Cache invalidation latency"

Last update
resolved

This incident has been resolved.

identified

We have some cache invalidation latency. New publications may take some time to become available through the catalog APIs.

Report: "Cache invalidation"

Last update
resolved

This incident has been resolved.

monitoring

Problem fixed. It was due to a wrong version of a third-party library that had been updated by mistake in one of our deployments. We are monitoring tonight's publications to make sure the issue is fixed.
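A common guard against this kind of accidental dependency bump is to verify installed library versions against a pinned list at deployment time. The sketch below is illustrative only; the package names and versions are placeholders, not the library involved in this incident.

```python
from importlib.metadata import version, PackageNotFoundError

# Hypothetical pinned versions; names and numbers are placeholders.
PINNED = {
    "requests": "2.31.0",
    "pillow": "10.2.0",
}

def check_pins(pins: dict) -> list:
    """Return human-readable mismatches between installed and pinned versions."""
    problems = []
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {expected})")
            continue
        if installed != expected:
            problems.append(f"{name}: installed {installed}, expected {expected}")
    return problems

if __name__ == "__main__":
    issues = check_pins(PINNED)
    if issues:
        raise SystemExit("Deployment check failed:\n" + "\n".join(issues))
    print("All pinned dependencies match.")
```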

investigating

We have some delays in cache invalidation. This can affect new publications in point-of-sale catalogs.

Report: "core api outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "Hardware failure on a core component"

Last update
resolved

Monitoring ok

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "API response time"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "FTP server fault"

Last update
resolved

This incident has been resolved. The server has been restarted, and the conversion file delays were resolved at 2:30.

investigating

Console & FTP system requests to the server are failing. There is a conversion queue overflow.

Report: "DB Locks"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

A large number of locks in our main database is impacting our service response time.

Report: "Global response time"

Last update
resolved

This incident has been resolved.

investigating

We are seeing abnormal response times from our internal & public APIs. We are currently investigating whether this problem is linked to the previous one.

Report: "Publication Delays"

Last update
resolved

This incident has been resolved.

identified

Our publication queue reached a critical number of tasks. Currently spawning more servers in the publication pool.

Report: "Gateway outage"

Last update
resolved

This incident has been resolved.

investigating

We have a network issue with one of our gateways. Our internal publication API is impacted.

Report: "API Partial Outage"

Last update
resolved

This was due to sub-optimal data access requests.

Report: "DB locks"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.