Historical record of incidents for Nanonets
Report: "GCP Global Outage – No Impact to Our Systems"
Last update: We are continuing to monitor for any further issues.
There is a major GCP global outage ongoing at the moment. We have ensured that all our critical dependencies have been redirected to fallbacks, and we are currently not impacted. GCP incident: https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1SsW We are closely monitoring the situation and will take any necessary actions if it evolves. Thanks for your continued support and vigilance. We'll keep you posted with any updates.
Report: "Degraded async file processing for few users on https://eu-open.nanonets.com"
Last update: This incident has been resolved.
Between 1 PM and 7 PM IST on June 2nd, very few users in our EU-Open region (https://eu-open.nanonets.com) experienced delayed file processing via the Async Prediction API due to a critical infrastructure issue that caused a backlog. This issue was limited to the EU-Open region, with all other regions operating normally. We have identified and resolved the root cause, scaled up the system, and are working to clear the backlog as quickly as possible.
Report: "Degraded async file processing for few users on https://eu-open.nanonets.com"
Last updateThis incident has been resolved.
Between 1 PM and 7 PM IST on June 2nd, very few users in our EU-Open region (https://eu-open.nanonets.com) experienced delayed file processing via the Async Prediction API due to a critical infrastructure issue that caused a backlog. This issue was limited to the EU-Open region, with all other regions operating normally. We have identified and resolved the root cause, scaled up the system, and are working to clear the backlog as quickly as possible.
Report: "Temporary disruption in file operations on app.nanonets.com"
Last update: On May 23rd, 2025, between 11:45 AM UTC and 12:10 PM UTC, we experienced a brief disruption affecting file information retrieval and related actions such as verification. The issue was promptly identified and resolved. All services are now fully operational, and we have implemented measures to prevent a recurrence. We apologize for the inconvenience caused and appreciate your patience.
Report: "Temporary disruption in file operations on app.nanonets.com"
Last updateOn May 23rd, 2025, between 11:45 AM UTC and 12:10 PM UTC, we experienced a brief disruption affecting file information retrieval and related actions such as verification. The issue was promptly identified and resolved. All services are now fully operational, and we have implemented measures to prevent a recurrence. We apologize for the inconvenience caused and appreciate your patience.
Report: "File processing degradation for instant learning models on US region"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Post-Processing Service Degradation and Export Failures on app.nanonets.com"
Last update: On May 7th 2025, between 13:05 UTC and 14:15 UTC, we experienced temporary failures in our post-processing (PP) service, leading to some export failures. The root cause was a sudden spike in traffic that overwhelmed the PP service, exposing a bottleneck in request handling. This has since been identified and addressed with a patch to improve traffic resilience. We regret the disruption caused and apologize for the inconvenience.
Report: "File processing degradation for instant learning models"
Last update: From 5:30 AM - 5:45 AM PDT, instant learning models may have experienced degraded performance, with higher latency and intermittent failures. This was caused by high load on one of our secondary databases. The issue has since been resolved.
Report: "Delayed file processing for all models using async api in app.nanonets.com"
Last update: Between 02:05 PM UTC and 02:42 PM UTC, some files experienced delayed processing on app.nanonets.com due to a sudden spike in load, which triggered system upscaling. This temporarily caused a backlog. We've identified the bottleneck and cleared the queue. All systems are now operational.
Report: "Delayed file processing for all models using async api"
Last update: On April 11th, 2025, between 11:35 AM UTC and 12:45 PM UTC, users experienced delays in file processing across all models on the async prediction API of [app.nanonets.com](http://app.nanonets.com). There was no impact on the EU and IN regions. The issue was caused by an internal service dependency that became bottlenecked, leading to a backlog in the processing queue. The root cause was promptly identified, and a fix was deployed to restore normal processing. We sincerely apologize for the inconvenience caused and are taking steps to prevent similar occurrences in the future.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Intermittent prediction failures for instant learning models on app.nanonets.com"
Last update: We observed intermittent prediction failures for instant learning models on app.nanonets.com between 11:10 AM and 11:45 AM IST on 2nd April 2025 due to a sudden spike in system load. Our team quickly identified the issue and scaled the system beyond the threshold to stabilize performance. We've also implemented additional checks to proactively handle such scenarios in the future. We sincerely apologize for the inconvenience caused.
Report: "Instant learning model prediction failures in app.nanonets.com"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "app.nanonets.com is slow"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "High Response Time on Instant Learning Models in EU Region"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Intermittent 5xx errors on app.nanonets.com"
Last update: This incident has been resolved.
We observed intermittent 5xx errors on app.nanonets.com between 8:26 PM IST and 8:46 PM IST. The issue has been identified and fixed, and we are actively monitoring the system to ensure stability.
Report: "Platform downtime issue for app.nanonets.com"
Last update: App is down in US region.
Investigating - We are currently investigating this issue. (10:38 AM UTC)
Monitoring - A fix has been implemented and we are monitoring the results. (10:43 AM UTC)
Report: "High Response Times for Non Instant learning models in US region"
Last update: On 21st Feb 2025, from 6:00 PM IST to 8:30 PM IST, non-instant learning models on [app.nanonets.com](http://app.nanonets.com) experienced high response times and timeouts due to extreme load on our system. Although auto-scaling was triggered on time, one of our services choked under the increased throughput. Our engineers quickly identified the bottleneck and increased the service's throughput, normalizing response times. We apologize for the inconvenience and are implementing measures to enhance system resilience and better handle future traffic spikes.
This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Intermitent API Errors"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Elevated response times for instant learning models"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Delayed File Processing For async Models and Data actions/approval run issues for sync models"
Last update: On 6th-Feb-2025, multi-page files uploaded through the Async API experienced delayed processing due to an issue at our end. This occurred between 5:07 PM IST and 6:20 PM IST, affecting app.nanonets.com, eu.nanonets.com, and eu-open.nanonets.com. During this period, file processing delays may have led to increased response times, and data actions and the approval flow did not run for files uploaded in sync mode. Resolution: The issue was identified and resolved, ensuring that all new file uploads are now processing correctly. Additionally, all impacted files uploaded during the affected window have been successfully processed. To prevent such occurrences in the future, we are implementing stronger regression checks before releases and reinforcing our monitoring systems. We sincerely apologize for the inconvenience caused and appreciate your patience and understanding.
Report: "Failures in Instant learning model on app.nanonets.com"
Last update: ### **Incident Summary:** Between 16:30 UTC and 22:30 UTC on 4th Feb, users experienced high response times and some failures when using Instant Learning models on [app.nanonets.com](https://app.nanonets.com). This was caused by unusually high latency in our secondary database when processing certain requests. Our other models on [app.nanonets.com](http://app.nanonets.com) are working fine, and our other regions (EU, EU-Open, IN) are operational without any issues. ### **Resolution:** Our engineering team identified the issue and implemented a fix to restore normal response times. We are planning to schedule a maintenance upgrade for our secondary database to prevent similar incidents in the future. We apologize for the inconvenience and appreciate your patience.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "App is down in US region"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "High Load Causing 500 Errors & Latency on app.nanonets.com"
Last update: Between 3:47 PM and 4:02 PM IST, intermittent 500 errors and high response times were observed for models on app.nanonets.com due to high system load. We have scaled our infrastructure to handle the increased demand and we are implementing proactive measures to manage such loads more effectively in the future.
Report: "Table extraction errors for instant learning models"
Last update: **Incident Summary:** On January 22, 2025, between 10:30 AM IST and 12:30 PM IST, users encountered failures in extracting table data from files uploaded to the Instant Learning Models. All other APIs and services remained fully operational during this period. **Resolution:** This issue specifically affected documents uploaded to the Instant Learning Models within the stated timeframe. Our engineering team promptly identified the root cause and implemented a resolution, fully restoring the service's functionality. The platform is now operating normally. We sincerely apologize for the inconvenience caused by this incident and regret the disruption it may have caused to our users. Ensuring the high availability, reliability and performance of our services remains our highest priority.
We're experiencing errors with table-data extraction on instant learning models
Report: "File prediction errors for instant learning models in US region"
Last update: **Incident Summary:** Between **03:50 PM IST and 04:08 PM IST** on January 21, 2025, users of `app.nanonets.com` experienced failures in processing requests related to Instant Learning Models. This issue was isolated to our secondary database, which encountered an operational issue, causing request failures. All other APIs and services remained unaffected. **Resolution:** The above issue impacted reads for Instant Learning Models. Our engineering team promptly resolved the read replica issue, restoring full service functionality. The platform is now operating as expected. We sincerely apologize for the inconvenience caused and deeply regret the disruption to our users. Ensuring high availability and reliability of our services is our top priority.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "App is down in EU region."
Last update: **Incident Summary:** On 13-Jan-2025 at 07:42 PM IST, an incident occurred where our application became temporarily unavailable. This affected the following services: * [https://eu.nanonets.com](https://eu.nanonets.com) * [https://eu-open.nanonets.com](https://eu-open.nanonets.com) The outage was caused by an accidental deletion of an authorization token required for database authentication. This token is essential for ensuring secure and seamless communication between the application and the database. **Resolution:** A new authorization token was generated and securely updated in the system. Services were restarted, and normal functionality was verified. We will introduce stricter permissions to ensure only authorized personnel can manage critical tokens. We sincerely apologize for the inconvenience caused by this incident. We acknowledge the importance of our services to your operations and deeply regret this unintentional disruption. Please be assured that we are taking all necessary steps to prevent such incidents in the future. **Timeline:** 13-Jan-2025 07:42 PM IST to 13-Jan-2025 07:54 PM IST
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Only https://eu.nanonets.com and https://eu-open.nanonets.com are affected. https://app.nanonets.com is working as expected.
Report: "app.nanonets not working"
Last update: This incident has been resolved.
Root Cause: Our primary databases experienced an unexpected outage during a planned update aimed at improving service performance. Resolution: The databases have been fully restored and are functioning normally. To ensure a smoother process in the future, we will schedule similar updates with advance notice and enhanced safeguards, including comprehensive backups.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Instant learning models in US region experienced delays in file processing."
Last update: Root Cause: One of our internal systems, responsible for handling a critical component of our processing pipeline, experienced an unexpected failure. This created back-pressure on our upstream systems, resulting in delays and partial failures in file processing. Resolution: The issue was addressed by restoring the affected instance. Moving forward, we will put strong fallback mechanisms in place to address this. Timeline: Jan 3rd, 5:30 AM IST to 6:25 AM IST
Report: "IN region Outage"
Last update: Our application experienced downtime in the IN region (https://in.nanonets.com) due to an issue with Redis: a large number of connections from the API server was mistaken by Redis for a security issue. We restarted Redis and the API servers to restore the connections. Timeline: 12:18 PM IST - 12:27 PM IST
Report: "Issue with Instant Learning Models"
Last update: We use multiple services to power our instant learning models. One of them is OpenAI, and on 11th Dec at 15:17 PST OpenAI went down ([https://status.openai.com/](https://status.openai.com/)). This caused upstream pressure on our inference systems, leading to long wait times and failures.
This incident has been resolved.
There is downtime in a downstream service (OpenAI, https://status.openai.com/) leading to degraded performance on our extraction results
There is downtime in a downstream service (OpenAI, https://status.openai.com/) leading to degraded performance on our extraction results
Report: "Instant learning model prediction issues in EU and EU-Open Regions."
Last update: **Incident Summary**: Instant learning models in EU and EU-Open regions experienced failures, resulting in request timeouts and service disruptions during the incident. **Root Cause**: A dependency conflict occurred when an external library update introduced incompatibility with a critical system component, leading to failure in processing requests. **Resolution:** The issue was addressed by pinning the affected library versions to known stable releases, ensuring compatibility across components. Moving forward, we have strengthened our dependency management processes to include proactive monitoring and testing of third-party updates to prevent similar issues.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Service Degradation on app.nanonets.com"
Last update: We sincerely apologize for the inconvenience caused by today's service disruption between **14:20 IST and 17:00 IST**. A feature deployment caused unexpected pressure on our database, leading to request timeouts and elevated 5xx errors. We have since rolled back the feature and restored normal operations. To prevent such issues in the future, we are strengthening our testing procedures and monitoring systems. Thank you for your understanding and continued support. If you have any further questions or concerns, please contact our support team.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified the issue and are working on a fix.
We are currently investigating this issue.
Report: "EU server downtime"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Downtime for EU instance"
Last update: **Incident Summary:** One of our database nodes went down, resulting in our APIs at [https://eu.nanonets.com](https://eu.nanonets.com) returning 5xx responses intermittently. This issue was quickly identified, and we implemented a rapid fix by restarting the affected node. Only the EU region was impacted; all other regions are working as expected. **Root Cause:** The database node experienced a high load, reaching its capacity limits, which led to the node becoming unavailable and causing disruptions in API service. **Resolution:** The node was immediately restarted to restore service, and we have scheduled a capacity upgrade on **November 17th** during a planned maintenance window to prevent future occurrences. We apologize for any inconvenience caused and appreciate your understanding. **Timeline:** 15:35 IST - 15:50 IST
This incident has been resolved.
We are currently investigating this issue.
Report: "File processing issue with Instant Learning and Zero Training models"
Last update: At around 10:46 UTC on 7th Nov, one of our queueing systems experienced heavy load, which led to requests getting queued and frequently timing out for Instant Learning and Zero Training models. We were alerted to it and quickly scaled it up, and by 11:15 UTC the backlog was cleared and the incident was resolved. We are adding additional alerting to this queueing system to make sure that we can catch these types of issues well before the queue backs up.
This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Intermittent Image Rendering Issue"
Last update: Our application is currently experiencing intermittent image rendering issues due to disruptions at our third-party image provider, ImageKit. While the API and other functionalities remain unaffected, images may occasionally fail to load for end-users and you may experience app slowness. We are closely monitoring the situation and are in contact with ImageKit to ensure a swift resolution. You can view their real-time status updates here: https://imagekitio.statuspage.io/
Report: "IN region Outage"
Last update: Our application experienced downtime in the IN region (https://in.nanonets.com) due to a database outage caused by an unexpected spike in load. The increased load resulted in the database going down, affecting application availability. We have increased the capacity and restarted the affected node to restore service. Timeline: 06:20 PM IST - 06:28 PM IST
Report: "User facing auto logged out issues"
Last update: **Incident Summary**: On 22nd Oct 2024, around 3:43 PM (IST), some users started facing random logout events. **Root Cause**: A critical downstream service failed to scale during the high-load phase. This resulted in some 5xx errors, causing users to be logged out. **Resolution**: We have provisioned enough resources to handle the load and are planning to set up auto-scaling for it. **Timeline:** Oct 22nd 3:43 PM IST - Oct 22nd 5:15 PM IST
This incident has been resolved.
We are currently investigating this issue.
Report: "Intermittent file upload failures"
Last update: **Incident Summary**: On 07-Oct-2024 at 11:30 PM IST, a new feature was deployed, which inadvertently caused a spike in our RDS usage. This led to some APIs timing out, causing 5xx errors, and resulting in random file upload failures for affected users. **Root Cause**: The root cause of the issue was a feature deployed by one of our developers that had an unintended effect on the database. The feature introduced inefficiencies in database queries, leading to a significant increase in RDS usage. The increased load caused API requests to exceed their time limits, resulting in timeouts and random file upload failures. **Resolution**: As soon as the issue was identified, we reverted the change and restored normal operations. The system returned to a stable state shortly after the rollback. We are now implementing additional safeguards and testing procedures to prevent similar incidents in the future, ensuring we do not experience such unanticipated impacts again. **Timeline:** Oct 7th 11:50 PM IST - Oct 8th 12:30 AM IST
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Instant learning models upload was failing"
Last update: Incident Summary: An enhancement to our instant learning model inadvertently caused issues, leading to upload failures or files being stuck in a queued state. This affected files uploaded between 03:55 PM IST - 04:15 PM IST. Root Cause: The issue was traced back to a recent enhancement for document-level processing in our instant learning model that caused unexpected behaviour. Resolution: We have identified and reverted the problematic enhancement. Additionally, all affected files from the specified timeframe will be retried. Moving forward, we will implement stricter controls in our deployment process and enhance our instrumentation to detect similar issues earlier.
Report: "Blank predictions on Eu- open"
Last update: This incident has been resolved. Interval: 12:06 PM - 2:45 PM IST
The issue has been identified and a fix is being implemented.
Report: "Partial Downtime"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Instant Learning Model Enhancement Impact on File Processing"
Last update: Incident Summary: An enhancement to our instant learning model inadvertently caused issues in some customer models, leading to file processing failures or files being stuck in a queued state. This affected files uploaded between 06:00 PM IST - 09:15 PM IST. Root Cause: The issue was traced back to a recent enhancement in our instant learning model that caused unexpected behaviour in a subset of customer models. This impact was not detected early due to the random nature of the issue affecting only a small number of users. Resolution: We have identified and reverted the problematic enhancement. Additionally, all affected files from the specified timeframe have been retried. Moving forward, we will implement stricter controls in our deployment process and enhance our instrumentation to detect similar issues earlier. We apologize for the inconvenience caused and are taking steps to prevent such incidents in the future.
Report: "Downtime for EU instance"
Last update: Issue: Intermittent failures when updating fields and exporting files, and errors while accessing models. Interval: 12:22 PM - 1:48 PM IST. RCA: We saw partial failures in the DB calls originating from one specific node. Our team terminated this node as an immediate fix, and we are adding instrumentation to avoid this in the long term.
Report: "IN region nginx server down"
Last update: We experienced intermittent downtime on our Nginx server in the IN region. EU and US regions are not affected. The affected Nginx instance has been restarted and is now fully operational and serving requests. Our team is actively investigating the root cause of this issue so that we can make a long-term fix for this. We apologize for any inconvenience this may have caused. Duration: 11:34 AM IST - 11:41 AM IST
Report: "Downtime for EU-open instance"
Last update: Issue: Instant learning models were down on the EU-Open instance. Time interval: 4:34 PM - 4:45 PM IST. RCA: Nodes were getting rescheduled and terminated, which resulted in the EU-Open downtime.
Report: "IN region slow response times"
Last update: High load on our load balancers led to slow response times for 15 minutes between 7:45 AM and 8:20 AM.
Report: "Failures in email imports for some customers"
Last update: Due to a recent config change, we saw failures in processing for some of the customers using the email import feature. The issue has been identified and the fix is deployed.
Report: "Platform downtime issue"
Last update: **Timeline** On 15th July 07:08 - 07:18 (UTC) ### Incident Summary * New users experienced login issues * File uploads were failing ### Root Cause The incident was caused by the failure of one of the database nodes. The system's architecture requires both database nodes to be functional to handle new user logins and file uploads. With one node down, the system could not accept more connections, leading to login failures for new users and failed file uploads. ### Incident Resolution The issue was resolved by restarting the downed database instance. Once the database node was back online, the system resumed normal operations, allowing new users to log in and file uploads to be processed successfully.
This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Files upload affected"
Last update: **Timeline**: Jul 9th 06:52 AM UTC - 09:50 AM UTC **Incident Summary:** Users experienced issues with rate limits, causing most users to be blocked from uploading further files. **Root Cause:** This issue was caused by a faulty deployment, which inadvertently affected the rate limits of different users. **Resolution:** We have identified and rolled back the faulty deployment to restore normal functionality. Additionally, we are implementing stricter measures and more thorough testing protocols to ensure that such mistakes do not occur in the future. We sincerely apologize for the inconvenience caused by this incident. Please be assured that we are taking all necessary steps to prevent such issues from happening again. We appreciate your patience and understanding as we work to improve our processes and deliver the best possible experience to our users.
File uploads affected for multiple users
Report: "Elevated response times for instant learning models"
Last update: **Incident Summary:** Users experienced elevated response times for instant learning models due to a disruption in our processing system. **Root Cause:** One of our GPU nodes was down, which significantly affected file processing times and led to slower response times for our users using instant learning models. **Resolution:** We promptly identified the machine and removed it from our pool, which restored normal processing times. As a long-term fix, we are implementing a robust mechanism to ensure that any node or machine going down will not impact file processing times. This will include automatic detection and removal of faulty nodes from our pool and redistribution of the workload to healthy nodes. We sincerely apologize for the inconvenience this incident may have caused. We understand the importance of reliable and fast service, and we are taking the necessary steps to prevent such issues from occurring in the future. We appreciate your patience and understanding.
This incident has been resolved.
We are currently investigating this issue.
Report: "Platform downtime issue for IN instance"
Last update: Issue: Platform downtime for the IN instance. Interval: 3:13 PM - 3:18 PM IST
Report: "Downtime for EU-open instance"
Last update: Issue: Downtime for the EU-Open instance. Interval: 4:05 PM - 4:10 PM IST. RCA: Nodes were getting rescheduled and terminated, which resulted in the EU-Open downtime.