7digital

Is 7digital Down Right Now? Check whether there is an ongoing outage.

7digital is currently Operational

Last checked from 7digital's official status page

Historical record of incidents for 7digital

Report: "Platform Incident - Search Outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented to restore service to search and we're continuing to monitor the issue. Regards, 7digital Client Success Team

identified

We're currently experiencing an outage with our Catalogue Search. This impacts users of the search API and the 7digital playlist tool, with search currently unavailable within the playlist tool. The 7digital engineering team are working to resolve the issue and we'll provide further updates as soon as we have them. Regards, 7digital Client Success Team

Report: "Ingestion outage impacting Catalogue Events"

Last update
resolved

Catalogue ingestion has now been switched on and the platform has been restored to full health. The team will continue to monitor and be on hand to react to any further incidents.

identified

Our 3rd party supplier has updated us that they're seeing significant signs of recovery to their service. As a result some of the impacted services on 7digital's platform are also starting to restore. We'll continue to monitor progress and have taken the decision to leave catalogue ingestion switched off until we're confident the platform is restored to full health. At this stage we plan to switch on catalogue ingestion shortly after 13:00 UTC if things continue to progress as expected. We'll update this incident at that time.

identified

On further investigation we've identified that an issue with a 3rd party supplier is impacting our ingestion platform. Despite attempting to implement a resolution internally, this has resulted in an outage impacting our ability to ingest content from suppliers. All catalogue updates will be delayed until our ingestion platform is restored to full health. To clarify the initial status message: our Catalogue Events product is operational, but due to the underlying issue with our ingestion platform we won't be generating any catalogue updates through Events.

identified

As of ~08:00 BST we started to observe errors in our Catalogue Events product which meant we were unable to push out any event messages. We've identified the issue and are implementing a fix to resolve the impact on Events. Once the service is restored we'll follow up with further updates and information.

Report: "Data Warehouse Processing Delay Impacting Incremental Artist Feed"

Last update
resolved

This incident has been resolved.

identified

Dear Clients, Due to unforeseen circumstances with our data warehouse, the incremental artist feed for 20240304 has been generated; however, it will not contain any data. To ensure you have access to the most up-to-date catalogue, we advise continuing as normal by ingesting the incremental artist feed for 20240305, which will cover data for both 20240304 and 20240305. We apologise for any inconvenience this may cause and want to assure you that we are actively working to resolve this issue and prevent future occurrences. If you have any questions, please contact your Client Success Manager. If you are a client who does not ingest our feeds then please ignore this email. With best regards, 7digital Client Success Team

Report: "Platform Incident - Elevated Streaming Errors"

Last update
resolved

This incident has now been resolved. We'll continue to monitor the platform and we'll continue to have engineering support staff on hand should any further issue occur.

monitoring

A fix has been implemented to deal with the elevated error rate and we're now seeing error rates returning to a normal level. We're continuing to monitor the platform and will take any further action as necessary.

identified

We're observing an elevated error rate against stream requests to the platform. The 7digital engineering team are working to resolve the issue and we'll provide further updates as we have them.

Report: "Platform unavailable due to unprecedented demand"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We're currently observing an unprecedented volume of demand on the platform, and as a result multiple areas of the API are currently unavailable. We're in the process of scaling up the platform to support this increased load and restore availability. As soon as we have further updates they'll be posted here.

Report: "Partial Outage: Elevated Error Rates"

Last update
resolved

This incident has been resolved.

monitoring

Dear clients, The platform degradation/outage experienced today is now closed. The platform is stable and the error rate has dropped. Our Tech team will continue to investigate. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

investigating

Dear Clients, We are currently seeing a higher than usual error rate on calls to various APIs, along with longer response times. Our team of on-call support engineers is currently investigating the issue and taking steps to improve the stability of the platform. We will provide you with additional details in a further notice once we have completed our analysis of the problem. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "Platform Incident - Ingestion Outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented to solve the problem, allowing us to resume ingestion. The backlog will be cleared gradually as we ramp up processing capacity.

identified

We've identified the fault and are currently in the process of restoring our ingestion platform. Once the ingestion platform is functional again we can begin ingesting pending deliveries and will provide a further update at that time.

identified

We are currently experiencing an outage on our content ingestion systems. We last processed new content and updates on Saturday morning (UK time). We are working towards a resolution now and plan to provide an update shortly.

Report: "Platform Incident - Download API Outage"

Last update
resolved

We've rolled back our Download API to a previous state which has restored downloads successfully. We'll continue to monitor to ensure the issue is resolved and will follow up with further information once we've compiled a post incident report.

investigating

We're currently investigating an issue with the download API where most or all download attempts are failing. As soon as we have further information we'll provide additional updates.

Report: "Platform Degradation"

Last update
resolved

This incident has been resolved.

monitoring

Dear clients, The platform degradation experienced today is now closed. The platform is stable and the error rate has now returned to normal performance levels. Our Tech team will continue to investigate and monitor performance. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

investigating

Dear Clients, We are currently seeing a degradation of service across most APIs. This issue is only affecting uncached media, as cache hits are served from our CDN and are therefore unaffected. Our on-call support engineers are currently investigating the issue and taking action to further stabilise the platform. Once we have completed our analysis of the issue we'll send out an additional notice with further details. With best regards, 7digital Client Success Team

Report: "Network failure incident"

Last update
resolved

At 13:46 BST we were alerted to an issue with one of our two network lines. After investigation, at 14:04 BST we removed the network at fault from service, directing all traffic to our secondary line which restored normal service to the platform. We are continuing to investigate what caused the failure and will restore network redundancy as soon as possible.

Report: "Playlist Tool Unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Dear clients, The 7digital Playlist Tool is currently unavailable. Once we have further updates we'll share them with additional announcements. You can subscribe to updates via email, webhooks and RSS feed on our statuspage (insert link). If you would like to receive SMS updates, please create a Service Desk ticket with the Client Success Team. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "Platform Outage"

Last update
postmortem

The incident report for this outage is now available and can be found [here](https://drive.google.com/file/d/1MK4JVxFuKtHnnvekUzSk2aT61Z0zo43G/view).

resolved

Full service has been restored to all components of the platform and monitoring will continue. We're confident service has been restored and we will follow up with an incident report next week once we've been able to fully evaluate the issue and the actions taken to remedy it.

monitoring

As per the previous update a fix has been implemented and service has been restored to the platform, with the exception of ALC purchasing, locker and ALC/permanent download endpoints. We are continuing to monitor the fix and will also be working on restoring high-availability to the platform.

identified

We are continuing to work to resolve the ongoing outage. The cause has been isolated to our SQL Server cluster, which is currently not able to keep certain databases online, taking them down for an as-yet-unknown reason. Our efforts to force a single node to host the databases have so far not been effective at solving the current issue. We believe at this stage that our cluster configuration has a non-trivial problem, and we are moving to bring up databases separately outside of the high-availability cluster to restore service as soon as we can. Following the return of stability, we will look to restore high availability as soon as we can.

identified

We are continuing to work on a fix for the issue and will provide an update again shortly. We can also confirm that cached streams continue to be served throughout the incident. We're reviewing activity to identify other areas of the platform which are only partially unavailable and will communicate updates with further information shortly.

identified

We have been able to successfully bring back online the affected DB cluster, however we're still experiencing problems keeping the DB cluster online permanently, causing the platform to be unavailable. We are continuing to investigate and will provide further updates shortly.

identified

We've now isolated the cause of the outage to our DB cluster. We're observing an issue between the primary and replica node which is hindering our ability to bring the cluster back online. We're continuing to investigate the issue and will provide further updates as they're available.

investigating

We are currently investigating a suspected platform outage. Currently all endpoints on 7digital's API are unavailable, 7digital engineers are investigating the cause and we'll provide further information as soon as it's available.

Report: "Download API Outage"

Last update
resolved

This incident has been resolved.

monitoring

The download outage experienced today is now closed. The endpoint is back to normal. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

investigating

Dear Clients, The Download APIs are currently down due to an outage. Streaming and Media Transfer endpoints are not affected. A new EC2 instance is being set up by the Tech team. Further updates will be announced as soon as they become available. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "Platform Degradation"

Last update
resolved

As of 10:50 GMT full service has resumed but we're continuing to monitor the platform.

identified

Dear Clients, Since 9:20am GMT we have seen a degradation across all APIs. Engineers on call are currently investigating the issue and are taking action to further stabilise the platform. A further update will be communicated in due course. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "Partial Ingestion Issue"

Last update
resolved

This incident has been resolved.

identified

This week we released a fix for an ingestion issue which had been causing some releases to fail ingestion. As a consequence of this fix some content is now being moved from our delivery platform too quickly, resulting in partial or no ingestion of affected releases. This is only affecting suppliers on automated ingestion. Our team are aware of the issue and are currently working on a resolution. Once the issue has been resolved we'll re-ingest all affected releases. Until this issue has been confirmed as resolved please raise tickets for any priority content released on Friday October 29th which is not already available and is delivered by a supplier on automated ingestion. Priority tickets can be raised here: https://7digitalops.atlassian.net/servicedesk/customer/portal/6/group/12/create/209

Report: "UMG Ingestion Delays"

Last update
resolved

Dear Clients, We are currently experiencing a delay with processing UMG content. A bug was detected in the UMG Teleporter application, causing a backlog of content to develop on their side. As a result, content will be delayed from entering the daily feeds. A fix has been deployed and UMG have been notified of the issue. We are currently processing the backlog, which is estimated to take 24 hours. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "UMG Feed Delay"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been deployed and UMG have been notified of the issue. We are currently processing the backlog.

identified

We are currently experiencing a delay with processing UMG content. At 8pm 01/07 a bug was detected in the UMG Teleporter application, causing a backlog of content to develop on their side. As a result, content delivered between 8pm 01/07 and 11am 02/07 may be delayed from entering the daily feeds. A fix has been deployed and UMG have been notified of the issue. We are currently processing the backlog.

Report: "Outage - CDN provider"

Last update
resolved

Dear Clients, The CDN outage experienced today is now closed. The platform is back to normal and the error rate has completely dropped since 13:37. With best regards, 7digital Client Success Team

monitoring

The error rate has dropped since 11:53 GMT. Our Tech team will continue to monitor the platform before we close this incident. For the most current updates, you can follow Fastly Status page here: https://status.fastly.com/

identified

We are continuing to work on a fix for this issue.

identified

The CDN outage is affecting all 7digital API and Media Delivery endpoints. Our on-call support engineers are currently investigating the issue and in contact with our CDN provider. You can subscribe to updates via email, webhooks and RSS feed on our statuspage (https://status.7digital.com/). If you would like to receive SMS updates, please create a Service Desk ticket with the Client Success Team. Once we have further updates we'll share them with additional announcements. If you have any questions, please create a Service Desk ticket with the Client Success Team.

investigating

Dear Clients, We are currently seeing a degradation of service with our CDN provider. Further notices will be sent regarding this incident as they become available. With best regards, 7digital Client Success Team

Report: "Playlist Tool Unavailable"

Last update
resolved

This incident has been resolved. The Playlist Tool is now stable.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating the issue. The Playlist API is not affected.

Report: "Elevated Error Rates - CDN"

Last update
resolved

This incident has been resolved.

monitoring

Between 15:17 and 16:04 GMT, our CDN provider encountered an issue causing an increase in error rates. As of 16:04 GMT normal service has been resumed but we're continuing to monitor the platform.

Report: "UMG Global Outage"

Last update
resolved

This incident has been resolved.

monitoring

UMG have now fixed the issue and we are receiving content once again. Whilst the issue has been resolved, we are expecting a 4-day turnaround to clear the backlog that has developed. As such, any releases from UMG Global from 18/12 may not be in feeds until this coming Friday 25/12. However, while we clear this backlog, we can now look to action priority release requests from UMG where possible.

identified

We are currently experiencing an issue with UMG Global's content delivery platform impacting UMG content from 18/12/20 onwards. Currently, the backlog of deliveries includes updates, takedowns and inserts. As a result of this backlog on UMG's servers, we are unable to action priority release requests from UMG at this time. The issue sits with UMG Global's delivery system and is unrelated to 7digital's content ingestion process. We are currently working on a fix with UMG. We apologise for any inconvenience and will keep you updated as we progress with a fix.

Report: "Platform Degradation"

Last update
postmortem

Our CDN provider (Fastly) has now confirmed that they have identified an issue with their service which directly caused the degradation seen in this incident. A further explanation of the issue which occurred at our CDN provider is below: Starting at approximately 08:51 UTC, Fastly observed three global transit provider events affecting most Fastly data centers, leading to increased 5xx errors and latency. The first event occurred from 08:51 to 08:55 UTC, the second from 09:57 to 10:04 UTC, and a third event starting at 10:17 UTC. During the third event, at 10:21 UTC, a Fastly traffic engineering configuration change, designed to mitigate the provider issues, inadvertently led to additional customer impact in the form of errors, latency, and timeouts. Additionally, the Fastly portal and API were affected during this event. Fastly Engineering started mitigations at 10:25 UTC, gradually restoring services until the last repair was completed at 10:56 UTC.

resolved

Between 11:22 and 12:00 BST we saw a degradation of service on the platform. As of 12:00 BST full service has resumed but we're continuing to monitor the platform and will follow up to provide further details on the impact of the incident and the measures taken to resolve and prevent it reoccurring.

Report: "Platform Outage"

Last update
postmortem

**7digital Incident Report**

### **Incident Details:**

**Incident Summary**

A switch within the CTR data centre power cycled itself, causing the ILB high-availability cluster to fail over. Whilst the ILB failover completed, the automatic failback (after the switch recovered) left the ILB and XRP in a state of limbo, which was only resolved when keepalived was restarted on all nodes. This caused an almost complete API outage, since most critical APIs rely on the ILB to route API calls. In addition, the cloud catalogue API did not recover as quickly as the data centre services because it uses a DNS entry on which automatic failover had been disabled.

### **Timeline**

- 22:42 - On-call SRE receives multiple Pingdom down alerts across all APIs.
- 22:45 - SRE online; reports the VPN used to access the platform is online. Identifies the severity of the outage and calls Client Success OOH.
- 22:52 - API health dashboard shows a 100% error rate on most endpoints with large response times (> 2 seconds). Core Platform errors dashboard shows an initial load of API Router errors indicating they are timing out whilst connecting to the DB.
- 22:56 - Client Success indicate they are working on a notification to clients.
- 22:56 - SRE starts working through the "data centre failure modes" runbook.
- 22:59 - DC cross connect identified as being up.
- 23:02 - ILB IP announcements look OK according to "ip a". SRE notices that the backup ILB briefly received some traffic recently. SRE decides to restart keepalived to force re-announcement of the IPs anyway.
- 23:05 - Most services start to recover; Pingdom alerts clear; Core Platform application errors mostly clear, apart from webstore & comparison-reproxy.
- 23:07 - SRE asked by Client Success whether the prepared platform announcement should still go out, since the platform looks to be recovering. Decision taken to send it, as stability is not yet clear. Notification sent to clients.
- 23:07 - SRE notices that VHC has taken all API traffic and Pingdom is still reporting CTR XRP as down. ~/track/details is also reported to be down.
- 23:11 - API origin DNS (which the Pingdom check uses) is found to be pointing to CTR. DNS Made Easy shows the record's auto-failover mode has been disabled. SRE re-enables the auto-failover. Pingdom alerts recover for all but the CTR XRP check.
- 23:15 - SRE notices that the release details endpoint still has a high error rate (50%) and the 7digital D2C webstore is still erroring in Core Platform application errors. Client Success manually checks ~/release/details and finds that it looks stable.
- 23:31 - Whilst investigating the issue with ~/release/details, SRE notices that errors to that endpoint dropped off from 23:26.
- 23:36 - SRE tells Client Success that the platform looks fully up, but they are still monitoring and looking into the loss of DC redundancy (CTR not handling API traffic).
- 23:39 - SRE forces 99% of API traffic to VHC whilst investigating the CTR issues.
- 23:41 - Client Success update incident status to "monitoring".
- 23:43 - SRE restarts NGINX on CTR XLB 00 and 01, to no effect. Restarting keepalived on those hosts, however, restores CTR's XRP service. Pingdom alert clears.
- 23:59 - Incident closed.

**Duration of outage/incident (Time to Recovery):** 25 minutes

**Time taken to isolate/diagnose the issue (Time to Isolate):** 25 minutes

### **Impact**

**What applications or services were affected?** Any partner services and internal applications (inc. web store) which use our API.

**How might these services have been affected?** Indicators show a complete outage during this time: error rates of 100% and high response times.

### **Technical Details**

It appears that whilst the platform correctly failed over given a presumed network blip, once the blip had resolved itself the failback did not complete cleanly. This caused the almost-total outage of the API. A smaller, secondary problem with how DNS is updated on the loss of a DC's XRP caused the cloud catalogue service to fail. This mainly affected 7digital's webstore service and did not impact partners.

**Dashboards:**

Core Platform Application Errors: ![](https://lh6.googleusercontent.com/WeEGRD8mG6lATlOP0BNVdPQPyyzWjFi6z813qjbnDIcRZjkuepQzVIg3h92qZIOUEPvJZaA_-7kgWkDbqOsdA-PKtB9_YObUSKwtiMAziCcVj2fLFsp7YnN7JW3U8tu4qeUBY7pq)

API Health dashboard: ![](https://lh4.googleusercontent.com/Z9m1QzuSmZV0OLPN0vHk7vNoQFbJM9uRHV-fPK-0eoBSx0eYyUVxg-MinSo9Ec4Fl4k-pgksZWcLp9K1jE4LbKlHnDkFL_8Js1l-ZSn81R_Bh6eWbUl2aeeoFKzLbQB5UNJWS2L1)

Data Centre Usage: ![](https://lh5.googleusercontent.com/7rxr1vLpKpdcp9_zEDF4EdpZcJFSl_iqBs3CuFUebhJau7Pbx-zHheQiD1SZiUpD77S1u5OhaNKjAzHECmPUTBlwyMDKoITs4bRTtXkqolmUBj8kyetwaNmHTtZWCSqamom_9rt-)

**Analysis of our response to rectifying the incident**

As is the case with many networking-triggered incidents, the information available to SREs was at first confusing and did not immediately reveal a resolution. However, since the team had witnessed something similar happen in the past, we had a runbook at hand to help SREs diagnose networking & data centre issues. This proved a decisive factor in the relatively quick recovery of the service, given the complex nature of the fault. Process-wise, we were quick to identify the impact to customers, and Client Success was able to notify partners as quickly as possible. We have also identified that we could have better documentation on how the cloud catalogue service is architected, so that the SRE team can better understand and recover the service.

### **Analysis of the technical issue/s**

Ideally, switch power cycles/failures should be able to happen with our infrastructure recovering or failing over automatically. In this instance, the infrastructure did not recover on its own and required SRE intervention to force all load balancers to re-announce their floating IPs to the switches. Our investigation will focus on how we can automate recovery of the service in this scenario, as we've had similar occurrences in the past. We're also aware of CTR-TEN-AS1 being a single point of failure in relation to the dark fibre, so we will look into ways of increasing redundancy there.

With regards to the ongoing webstore issues, it is presumed that the reason it continued to fail was its reliance on a DNS entry that had its automatic failover disabled. Since DNS Made Easy provides no audit trail, we will look at ways of regularly snapshotting the configuration to source control so we can trace changes in future. It is presumed that once the TTL for the bad DNS record had expired, the cloud catalogue infrastructure recovered on its own, hence no intervention was required to fix that issue following the re-enabling of the automatic failover.

**Conclusions and Actions**

The resultant de-briefing identified the following issues with our process:

1. There are multiple locations which explain our incident response process. We should remove redundant copies of the process so that only one is accessible, to avoid confusion.

In general we were fairly happy with how quickly we responded. However, as this is the second time the load balancer has not recovered following a quick failback, we will prioritise automating this so that manual SRE intervention is not needed in future. We will also look at updating our documentation on how the new cloud catalogue flow works for the SRE team.
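The floating-IP behaviour described in the report (load balancers announcing a shared virtual IP, with keepalived restarts forcing re-announcement) corresponds to a VRRP setup of roughly the following shape. This is a generic, illustrative sketch only: the interface name, router ID, priority and address are placeholders, not 7digital's actual configuration.

```
# Illustrative keepalived VRRP instance (placeholder values, not
# 7digital's real config). The address in virtual_ipaddress is the
# floating service IP held by the MASTER node; on failover the BACKUP
# node takes it over and sends gratuitous ARPs so the switches learn
# the new path to the IP.
vrrp_instance ILB_VIP {
    state MASTER            # the peer node runs as BACKUP
    interface eth0          # NIC carrying the virtual IP
    virtual_router_id 51    # must match on both nodes
    priority 150            # higher priority wins the MASTER election
    advert_int 1            # VRRP advertisement interval, seconds
    virtual_ipaddress {
        192.0.2.10/24       # the floating service IP
    }
}
```

In a setup like this, restarting keepalived triggers a fresh MASTER election and new gratuitous ARP announcements, which is presumably why the restarts at 23:02 and 23:43 cleared the stuck post-failback state described in the timeline.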

resolved

Dear clients, The platform outage experienced today is now closed. The platform is back to normal and the error rate has completely dropped since 11:25. Our Tech team will continue to investigate and an incident report will be shared in due course. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

monitoring

Dear clients, The platform has now been restored and the error rate has dropped since 23:17 GMT. Our Tech team will continue to monitor the platform before we close this incident. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

investigating

Dear clients, From 22:45 GMT we have been experiencing a severe platform outage affecting all areas of the 7digital API. Our on-call support engineers are currently investigating the issue and taking action to further stabilise the platform. You can subscribe to updates via email, webhooks and RSS feed on our statuspage (https://status.7digital.com/). If you would like to receive SMS updates, please create a Service Desk ticket with the Client Success Team. Once we have further updates we'll share them with additional announcements. If you have any questions, please create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

Report: "Elevated Error Rates - Platform Wide"

Last update
resolved

Dear clients, The platform degradation/outage experienced today is now closed. The platform is stable and the error rate has dropped as of 19:23 GMT. Our Tech team will continue to investigate. If you continue to experience issues with the 7digital platform, create a Service Desk ticket with the Client Success Team. With best regards, 7digital Client Success Team

investigating

Dear Clients, We are currently observing an elevated error rate on calls to our API. Our on call support engineers are currently investigating the issue. Once we have completed analysis of the issue we'll send an additional notice out with further details. If you have any questions, please create a Service Desk ticket with the Client Success Team by raising the issue here: https://7digitalops.atlassian.net/servicedesk/customer/portal/6/group/12 Regards, 7digital Client Success Team