Elium

Is Elium Down Right Now? Check if there is a current outage ongoing.

Elium is currently Operational

Last checked from Elium's official status page

Historical record of incidents for Elium

Report: "Corrupt/unavailable file upload"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have rolled back the affected version and uploads are working again. We are monitoring the issue and assessing the corrupted files.

investigating

Our file provider is experiencing file corruption on upload. Since Feb 04, 2025 - 14:30 CET, all uploads have been blocked to avoid further issues. We are currently investigating possible fixes.

Report: "DNS issues on our private Cloud provider (Outscale)"

Last update
resolved

Our private cloud provider has now told us that the incident is closed on their side. All services seem to be working correctly again.

monitoring

As of 12:20, we are no longer detecting any errors on the services hosted by our private cloud provider. However, we have not received confirmation from them that the incident has been resolved. We are therefore continuing to monitor the platforms.

identified

We received feedback from our private cloud provider: "We have an incident on the internal network of our orchestrator. Our technical teams are working on a resolution. We will get back to you as soon as the problem is resolved."

identified

The issue has been identified and a fix is being implemented.

investigating

Our private Cloud provider has informed us that they are currently experiencing problems with their DNS service. This is having an impact on certain features of the solution. In particular, services requiring external access, such as integrations with services like Google, Facebook, etc., as well as file storage and sending and receiving emails, are currently experiencing intermittent problems.

Report: "Issue on private (Outscale) hosting"

Last update
resolved

We identified the root cause and fixed it. The platforms on the private (Outscale) hosting are now completely available again.

identified

The platforms on the private (Outscale) hosting are intermittently unavailable. We are continuing to investigate the root cause of this incident and are working on a permanent fix.

identified

One of the servers hosting private (Outscale) platforms is showing an unexpected increase in memory load. We switched traffic to other backup servers. We are still investigating the root cause of this incident.

Report: "Issue on private (Outscale) hosting"

Last update
resolved

We deployed a fix to ensure that this problem does not occur again.

monitoring

The services of our private cloud provider are finally back. We are still monitoring and investigating to prevent the problem from recurring.

investigating

We are detecting problems with connections to platforms hosted by our private cloud (Outscale). We are investigating the problem.

Report: "Issue on private (Outscale) hosting"

Last update
resolved

This incident has been resolved.

monitoring

All platforms hosted on our private cloud (Outscale) have been restored.

identified

Services hosted on our private cloud (Outscale) are now restored and platforms are available, but the storage system still seems to be broken and files are still unavailable. We are awaiting further information.

identified

Our cloud provider (Outscale) has notified us that a fix has been implemented and that they are still monitoring the results. However, we still have issues with files.

investigating

Connections to services and virtual machines seem to be working again. Storage is still not working and files are not accessible.

investigating

Our cloud provider (Outscale) has extended the incident to include their persistent storage, public network and API. We are awaiting further information.

investigating

Our Cloud provider (Outscale) reported a problem with their internal network and data hosting. We are awaiting further information.

investigating

Some virtual machines hosted in our private cloud (Outscale) are not responding. We have contacted our cloud provider and are awaiting further information.

investigating

We are continuing to investigate this issue.

investigating

We are detecting problems with connections to platforms hosted by our private cloud (Outscale). We are investigating the problem.

Report: "Unavailable Files on Private Hosting"

Last update
resolved

This incident has been resolved.

monitoring

At around 3:30 pm, our cloud provider indicated that the incident was over and that the system was stable. For our part, we have not detected any errors since 2:20 pm. We continue to monitor the service.

identified

This Friday morning at around 8:30, we detected an increase in errors accessing files hosted on certain platforms. These errors are due to an incident currently being experienced by our Cloud provider hosting the files. This is making the upload and download of certain files temporarily unavailable. We are monitoring the progress of this incident with our supplier.

identified

We have detected an abnormal number of 502 errors on requests to our storage provider (for private hosting). For the moment, there is no critical impact on platforms (only a few requests are failing). Our storage provider has reported an outage on its status page.

Report: "Unavailable Files"

Last update
resolved

The problem was identified and actions were taken to correct it. We have not observed any 502 errors since 7:00 p.m. However, we are keeping the matter under close watch.

identified

We have detected an abnormal number of 502 errors on requests to our storage provider (for private hosting). For the moment, there is no critical impact on platforms (only a few requests are failing). Our storage provider has not reported any outage on its status page.

Report: "Unavailable Files"

Last update
resolved

There do not seem to have been any more 502 errors since 11:52.

identified

We have detected an abnormal number of 502 errors on requests to our storage provider (for private hosting). For the moment, there is no critical impact on platforms (only a few requests are failing). Our storage provider has not reported any outage on its status page.

Report: "Unavailable Files"

Last update
resolved

This incident has been resolved.

monitoring

There do not seem to have been any more 502 errors since 12:15.

identified

We have detected an abnormal number of 502 errors on requests to our storage provider (for private hosting). For the moment, there is no critical impact on platforms (only a few requests are failing). Our storage provider has not reported any outage on its status page.

Report: "Unavailable Files"

Last update
resolved

This incident has been resolved.

monitoring

Our upstream provider informed us that the maintenance was completed and the situation is being monitored. We no longer observe any errors coming from the storage system at this point.

identified

Our upstream provider informed us that the recovery has been delayed and is now expected no earlier than 15h30.

identified

Our storage provider informed us of an unplanned operation on the file storage service. The service will be unavailable until at least 14h while it is being worked on. Data integrity is not affected. We are in close phone contact with our provider and will keep updating the status of the operation. During this time, file uploads and downloads will not work.

identified

The issue has been identified by our provider. We are waiting for a fix on their side.

Report: "File access issues"

Last update
resolved

This incident has been resolved.

identified

Uploads and thumbnails are affected. The issue lies with the file storage provider.

Report: "unavailable instance"

Last update
resolved

This incident has been resolved.

monitoring

The situation seems to be back to normal. We are still investigating and monitoring.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Increased issues to access files on private hosting"

Last update
resolved

This incident has been resolved by the provider. We will continue to monitor the system.

monitoring

Outscale has marked the incident as fixed. We are continuing to monitor the situation.

identified

A temporary service interruption is currently impacting our file storage functionality. Unfortunately, our Cloud Storage provider is experiencing an issue, which is affecting the download and upload of files within our system. Our technical team is in close communication with their support team to identify the root cause and implement a solution swiftly. During this period, end-users may encounter difficulties in accessing or uploading files. We sincerely apologize for any inconvenience this may cause. Our team will provide regular updates on the incident, including any estimated timelines for resolution or any workarounds available in the meantime.

investigating

Some files are inaccessible: PDFs, thumbnails, ...

Report: "Unavailable frontend"

Last update
resolved

We switched back to the original hosting for static resources (JavaScript, CSS). We are still monitoring for errors.

monitoring

We identified intermittent failures serving static resources (JavaScript, CSS) for some platforms and for specific end users. We traced these issues to this morning's incident at one of Google's data centers, which is impacting the global CDN that serves these static resources. We have implemented a temporary workaround while waiting for Google to resolve the root cause: static resources were deployed on another cloud provider (Amazon) and are served from there.

investigating

We are continuing to investigate this issue.

investigating

We are investigating the issue

Report: "File storage unavailable"

Last update
resolved

Our upstream provider reports a fix has been deployed

monitoring

We have not seen any errors from the storage provider since 12h05. We are now waiting for confirmation that the problem is fixed.

identified

The issue has been acknowledged by our provider and they are working on a solution

investigating

We detected an elevated error rate when using our private hosting provider's file storage.

Report: "File storage unavailability"

Last update
resolved

Our provider confirms the fix is operational.

monitoring

We are no longer seeing errors from our provider. We are waiting for their confirmation that the problem is solved on their end

identified

Our upstream provider has updated its status page (https://status.outscale.com/). We are working with them to determine the time to resolution.

investigating

We are continuing to investigate this issue.

investigating

We are currently noticing an increased error rate on file storage operations on our private hosting provider.

Report: "Storage system is unavailable"

Last update
resolved

Our provider informs us that the issue is now resolved

identified

More information can be found here: https://status.outscale.com/

investigating

Our storage provider is reporting an outage of its system. This makes image and file uploads unavailable for the moment.

Report: "3DS Outscale Objects Storage issue"

Last update
resolved

This incident has been resolved.

monitoring

The incident was fixed by Outscale at 19:10.

identified

We are continuing to work on a fix for this issue.

identified

A fix is being implemented by 3DS Outscale, and emergency maintenance is ongoing on the 3DS Outscale Objects Storage service.

Report: "Incident with indexing system."

Last update
resolved

This morning, we detected a problem with our system for indexing platform content. It was no longer working properly and some content was temporarily not visible on the platforms. We have corrected the problem and restarted the indexing process. This incident also affects other platform features, such as mentions and search.

Report: "Partial service unavailability"

Last update
resolved

A random portion of requests made during the affected period failed with a 404 or 500 error. This only affected a small portion of our clients running in our Google datacenter. Requests started failing at 17h00 and service was restored at 17h20.

Report: "Service unavailability"

Last update
resolved

This issue is resolved

monitoring

This incident will be closed when the maintenance window from our provider closes at 14:00 CET. No further interruption should happen until then

monitoring

Service has been restored upstream

identified

Our infrastructure provider is performing urgent maintenance on their network, causing temporary drops in connectivity.

investigating

We detected an interruption in network connectivity in our private datacenter, causing instances to be unresponsive.

investigating

We are currently investigating this issue.

Report: "Load on one of our Outscale K8S cluster node"

Last update
resolved

We performed several tests (including the deployment of a new version of the Elium services) to validate that the new node is stable.

monitoring

We have created a new node using different hardware specifications (CPU type). After several tests, we found that the abnormal load problem no longer occurs on this type of machine. We continue to monitor the behaviour of this node. At the same time, we are reporting our findings to 3DS Outscale support in order to validate that the problem comes from the type of machine used for this node.

identified

We are still testing different configurations for the faulty node (different kernel version, creating another node).

identified

We completely recreated the node and redeployed the services. The load continues to increase abnormally and this impacts the customer instances. We have therefore, once again, disabled the services on this node.

identified

We are trying to solve the node load problem. It causes slowness on client instances hosted on our private hosting (Outscale) when services restart on the node.

identified

During rolling updates, restarting containers on the node produces timeouts

investigating

Restarting the node solved the load problem. We are still checking why this load occurred. Currently, the services are working properly again.

investigating

We have detected an abnormal load on one of the nodes of our Outscale Kubernetes cluster. We had to restart it.

Report: "3DS Outscale issue"

Last update
resolved

New network configuration has been applied.

monitoring

We have reverted to the previous network configuration and are monitoring the behaviour. The new network configuration will be tested again after further impact analysis.

identified

Network maintenance was the root cause of the issue. We restored the previous configuration and are continuing our investigation.

investigating

We have an issue involving 3DS Outscale hosting. We are currently investigating it.

Report: "Memory overload on services storage system."

Last update
resolved

Our service storage system is unavailable due to an overload. The platforms are currently inaccessible.

Report: "Queue processing issues"

Last update
resolved

This incident has been resolved.

monitoring

A larger queue processor is now live. Some delays are expected while the queue is being processed.

identified

Our queue processor ran out of capacity. A larger one is being provisioned.

investigating

We are investigating an issue in processing background tasks

Report: "Service Unavailable for private hosting"

Last update
resolved

From 11:46 to 11:55, service for our private hosting customers was unavailable due to a wrongly configured background task. The background task consumed too many database resources, leaving the web service unable to reach the database and unable to respond. Once identified, the background task was cancelled. It will be rescheduled later with better resource management.

Report: "Bug in the production frontend version"

Last update
resolved

A new version, 1.67.10, which fixes the bug in release 1.67.9, has been released to production.

monitoring

We have already reverted production to the previous release, 1.67.8.

investigating

An undetected bug was deployed in production release 1.67.9 of the frontend. We will revert to release 1.67.8 as soon as possible.

Report: "Storage system outage"

Last update
resolved

System is fully operational

monitoring

The storage system is now up and running and service should be resumed. We are still seeing some errors when serving thumbnails.

identified

The memory issue has been fixed and the storage system is rebooting.

identified

Our storage system is experiencing a memory issue, which is affecting the general availability of the service.

investigating

We are currently investigating this issue.

Report: "Loss of internet connectivity"

Last update
postmortem

Friday 11/12/2020 - 14:25: backbone alarm raised for switch B19B4530WIN0 and a few other downstream devices.
Friday 11/12/2020 - 14:30: basic troubleshooting; a power failure is suspected.
Friday 11/12/2020 - 15:10: engineer arrives at WDC; some tests carried out on the power supply and fans of B19B4530WIN0.
Friday 11/12/2020 - 15:20: tests inconclusive; we decide to replace the chassis of B19B4530WIN0. B19B4530WIN0 consists of 2 stacked chassis, and the faulty chassis is identified as the C3750-X, which is available as a backbone spare in stock at Wierde.
Friday 11/12/2020 - 15:30: spare CAT3750-X taken out of stock and transferred to WDC.
Friday 11/12/2020 - 15:30 to 16:15: untangling and identification of the UTP connections terminating on B19B4530WIN0 to prepare the migration.
Friday 11/12/2020 - 16:10: spare switch arrives at WDC.
Friday 11/12/2020 - 16:15: configuration of the spare switch.
Friday 11/12/2020 - 16:40: replacement of the faulty switch.
Friday 11/12/2020 - 17:00: stack formed between the 2 members of the switch; reconnection of the UTP cables begins.
Friday 11/12/2020 - 17:04: switch rebooted to configure the system MTU.
Friday 11/12/2020 - 17:07: reconnection of the UTP connections on the spare switch completed.
Friday 11/12/2020 - 17:07: end of the intervention.
ROOT CAUSE: hardware failure of the B19B4530WIN0 chassis.

resolved

Upstream provider connectivity has been restored in our datacenter.

monitoring

We identified another issue related to serving thumbnail and file contents. It should be resolved as soon as the new DNS record propagates.

monitoring

Our internal DNS resolver was still set to the failing primary internet line, and has been switched to use our backup line DNS provider

monitoring

We have been having DNS issues in part of our private hosting facility since the upstream switch.

monitoring

The datacenter has confirmed they have a problem with one of their internet providers. Our backup provider is unaffected.

identified

We had to update our DNS records to point to our backup external IP addresses. Depending on the cached value, this might take a few minutes to propagate.

investigating

We switched our internet connectivity to our backup provider

investigating

Instances hosted in our private hosting facility are unreachable because our internet connectivity is down

investigating

We are currently investigating this issue.

Report: "Loss of connectivity"

Last update
resolved

Connectivity is restored

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Connectivity issues"

Last update
resolved

We have not detected any remaining connectivity issues.

monitoring

The correct routing configuration is now deployed, and service is stable

identified

The previous configuration has been deployed and requests are being served again. Some requests might fail while the new configuration is corrected and deployed.

identified

We identified a routing issue in our private hosting facility, and restored a previous working configuration

investigating

We detected issues serving requests in our private hosting facility. Service may be unavailable.

investigating

We are currently investigating this issue.

Report: "Update memory allocated to distributed storage system."

Last update
resolved

All systems have been updated.

monitoring

More memory was allocated to the system.

identified

We are continuing to work on a fix for this issue.

identified

We detected memory pressure on one of the systems in our distributed storage. We allocated more memory and rebooted this system.

Report: "Degraded performances"

Last update
resolved

This incident has been resolved.

identified

We are currently experiencing degraded performance due to high load on our background tasks.