Historical record of incidents for Clarifai
Report: "Clarifai documentation at https://docs.clarifai.com/ is down"
Last update: This incident has been resolved.
While we work with GitBook to fix the issue, we have found a workaround. Go to https://docs.clarifai.com/api-guide/api-overview/api-clients (it should be accessible) and click 'Clarifai Guide' in the top-left corner. It should take you to https://docs.clarifai.com/
Due to issues with the GitBook host, our documentation is not available. We are working with GitBook to resolve this issue.
Report: "Failures for search and predict operations"
Last update: The issue was identified. All services are operational now.
We are currently investigating this issue.
Report: "Cannot login to Portal"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
Users cannot log in to the Portal, and the SSO buttons (Google, GitHub) are not appearing on the login screen.
Report: "Database issues affecting input loading"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
At 6:30 am EDT, we experienced an outage on a database node as a result of an automated failover on a redundant node. This outage is affecting all customers and is currently preventing write and search operations on our cloud platform. All prediction operations continue to function normally. Our team is working diligently with our service provider to restore service. In the meantime, we will continue to keep you updated as we restore our systems. We understand the impact this is potentially having on some customers. Please be assured we are doing everything within our power to resolve this incident as soon as possible.
We are working through a database issue that is affecting all queries.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "System-wide issues affecting Visual Search and Input operations"
Last update: The fix from earlier today has been verified as stable. All systems are running normally now.
A fix has been implemented by our vendor and we are continuing to monitor it. All visual search and input operations should be working again as expected.
At 10:03 a.m. EST this morning we noticed that our visual search and input functionality was failing and traced the cause back to a third-party vendor. We are waiting for them to confirm a fix, and we apologize for the inconvenience.
Report: "System-wide issues affecting Visual Search and Input operations"
Last update: This incident has been resolved.
We have identified the source of the issue and have implemented a fix for it. We will continue to monitor this to ensure that it is stable.
We have been experiencing issues with our Visual Search platform starting around 9:50am EST and are still investigating a potential fix. Any endpoints using inputs are also affected by this and our Explorer UI may experience slowness/loading issues as well. Apologies for any inconvenience caused by this.
Report: "Visual Search Issues"
Last update: All visual search operations are running as expected again as of 3:27pm EST. Postmortem to follow shortly.
We have been experiencing issues with our Visual Search platform starting around 11:45am EST and are still investigating a potential fix. Apologies for any inconvenience caused by this.
Report: "Visual Search Latency"
Last update: This incident has been resolved.
Additional Visual Search capacity has been added and our engineering team is monitoring request latency.
Our engineering team has identified an issue related to Visual Search. During this time, users may receive error messages when attempting to issue Visual Search related API operations. We apologize for the inconvenience.
Report: "Intermittent issues this morning"
Last update: Around 6:00am EDT we started to experience issues with our Custom Training, Visual Search and Explorer UI tools that likely interrupted access to them. This was resolved around 10:40am EDT, and we are still investigating the cause and possible long-term solutions with one of our vendors. Apologies for any inconvenience that this caused.
Report: "Explorer, Training and Search Down"
Last update: This incident has been resolved.
A fix has been applied and all systems are running normally again.
Around 3:20 p.m. EST we started to experience issues similar to yesterday's outage and are actively pursuing a fix. More information to come shortly.
Report: "Partial System Outage"
Last update: This incident has been resolved.
We have successfully restored service to the affected areas and are monitoring to ensure that it remains stable. Detailed post-mortem to follow.
We have identified the root cause of today's issues and are currently working on a fix.
Our Operations Team is currently looking into some issues with one of our third-party vendors. Affected platforms: Custom Training, Visual Search, Explorer Tool. We apologize for any inconvenience caused by this.
Report: "Partial Outage"
Last update: We are still waiting on a root cause and will update with our postmortem accordingly. Apologies again for any inconvenience.
Service was restored around 5:11 p.m. EST. We opened a ticket with our database vendor to determine the root cause and are still waiting to hear back from them.
Around 4:20pm EST we noticed a large spike in gateway timeout errors in our internal system logs. We are investigating this with high priority and apologize for any inconvenience caused. Affected platforms: Visual Search, Custom Training, Input Uploads, Explorer Tool.
Report: "Partial Visual Search Outage"
Last update: The issue has been resolved and all search operations should be working again. Post-mortem to follow.
We are currently experiencing some issues with our Search platform and are looking into the root cause. Apologies for the inconvenience.
Report: "Database Deadlocked"
Last update: The issue should now be resolved.
We are currently investigating some database issues that resulted from a recent migration. As a result, you may not be able to access your Clarifai applications in the interim.
Report: "Minor System Outage"
Last update: The issue has now been resolved. We've confirmed that it intermittently affected all database-related features (search, post inputs, etc.) between 5:28pm EST and 6:58pm EST.
We've identified that the problem is on the vendor's end and it appears that it has been resolved. We are continuing to monitor the situation.
Around 6:25pm EST we discovered that database reads and writes are failing intermittently. We've opened a high-priority ticket with our vendor and will continue to investigate.
Report: "Database Maintenance"
Last update: Due to a required database update, we needed to pause all requests involving database changes from Nov 21 00:00:00 to Nov 21 00:45:00 US Eastern Time. During the maintenance window, users were unable to add, modify or delete inputs and concepts, and any such calls would have resulted in an HTTP code 401, error code 39998, and an error message of “Input writes are disabled for maintenance. Please try again in a few hours.” Predictions (public and custom) and searches with existing inputs remained unaffected during this period.
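For clients that write inputs during a window like this, the following is a minimal sketch of how the rejection described above might be detected and retried. Only the HTTP code 401 and error code 39998 come from the notice; the `send_write` callable, the `(http_status, error_code)` return shape, and the retry timings are hypothetical and are not part of any Clarifai client.

```python
import time
from typing import Callable, Tuple

# Values quoted in the maintenance notice above.
MAINTENANCE_HTTP_STATUS = 401
MAINTENANCE_ERROR_CODE = 39998


def is_maintenance_rejection(http_status: int, error_code: int) -> bool:
    """True when a write was refused because input writes are paused."""
    return (http_status == MAINTENANCE_HTTP_STATUS
            and error_code == MAINTENANCE_ERROR_CODE)


def write_with_maintenance_retry(
    send_write: Callable[[], Tuple[int, int]],  # hypothetical: returns (http_status, error_code)
    max_attempts: int = 5,                      # assumed limit, not from the notice
    wait_seconds: float = 600.0,                # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, int]:
    """Retry a write only while it is rejected with the maintenance codes."""
    result = (MAINTENANCE_HTTP_STATUS, MAINTENANCE_ERROR_CODE)
    for attempt in range(1, max_attempts + 1):
        result = send_write()
        if not is_maintenance_rejection(*result):
            return result  # success or an unrelated error: do not retry
        if attempt < max_attempts:
            sleep(wait_seconds)
    return result  # still in maintenance after the last attempt
```

Matching on both codes together keeps the retry narrow: an ordinary 401 (for example, a bad API key) would carry a different error code and is returned to the caller immediately.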
Report: "Partial System Outage"
Last update: From 2:46pm EST to 4:22pm EST we were experiencing intermittent issues with our API, which resulted in some public model failures. A fix has been applied for this and we are continuing to investigate the root cause.
Report: "Partial System Outage"
Last update: The issue was resolved around 4:55pm EST.
We have identified the problem and have implemented a fix for it. More information will be provided shortly on the cause.
Around 4pm EST today we started seeing issues with our User Interface and subsequently, our API. We are currently looking into these and will update when we have more information. Apologies for the inconvenience caused by this.
Report: "Logo Model experiencing some issues"
Last update: The code fix that we deployed has resolved the issue, and we have taken steps to ensure that a similar case is avoided in the future.
We instituted a fix for the issue at hand at 4:05pm EST and are monitoring its success. At the moment the model is making successful predictions again.
Around 12:55pm EST today, our public Logo Model stopped returning successful prediction results. We are still investigating the matter and should have more information shortly.
Report: "System outage"
Last update: Starting at 4:08PM Eastern Daylight Time, an internal service misconfiguration resulted in a system-wide outage. Proximate cause was identified at 4:32PM. A first remediation was attempted at 4:45PM, but it did not resolve the outage. Root cause was identified at 4:53PM and services fully recovered by 4:55PM. We will perform an internal review to improve internal tools and prevent the incident from recurring. Additional monitoring will be put in place to detect problems and identify root causes earlier. Apologies for any inconvenience that this caused.
We are currently investigating the causes of a recent system-wide outage and should have more information shortly. All systems should be running smoothly again. Apologies for any inconvenience that this caused.
Report: "Server Errors - Some API Calls Affected"
Last update: We have verified that this issue is resolved. Thank you for your patience. If you experience recurring issues, please reach out to us at support@clarifai.com.
We have implemented a fix and are currently monitoring the issue. Please contact support directly if you receive this error so we can better assist you. Again, thank you for your incredible patience in this matter.
We've discovered that this is related solely to local file uploads, so if you are receiving this you can use URLs as a temporary workaround for now.
We are currently investigating some server issues that are triggering a "Sorry, the server is too busy at the moment. Please try again later" message, and we hope to have them resolved soon. Thank you for your patience.
Report: "API Gateway Timeouts - Some API Calls Affected"
Last update: This issue has been resolved. Please contact support@clarifai.com if you are still seeing requests that time out. Thanks for your patience.
We have implemented some fixes to this issue and are currently monitoring whether the issue is completely resolved for all users.
We upgraded our infrastructure last week and have noticed random gateway timeouts for a small subset of our API requests. We are currently investigating this matter and hope to have it resolved soon. Thank you for your patience.
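Intermittent gateway timeouts like the ones described in this incident are commonly handled on the client side by retrying with exponential backoff. The sketch below is one way to do that, not a documented Clarifai recommendation: HTTP 504 is the standard Gateway Timeout status, while `send_request`, the retry count, and the delay values are assumptions chosen for illustration.

```python
import time
from typing import Callable, List

GATEWAY_TIMEOUT = 504  # standard HTTP status for "Gateway Timeout"


def backoff_delays(retries: int, base: float = 0.5,
                   factor: float = 2.0, cap: float = 8.0) -> List[float]:
    """Delays that grow by `factor` each retry, limited to `cap` seconds."""
    return [min(cap, base * factor ** i) for i in range(retries)]


def call_with_backoff(
    send_request: Callable[[], int],  # hypothetical: performs the call, returns the HTTP status
    retries: int = 4,                 # assumed retry budget
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat a request while it ends in a gateway timeout."""
    status = send_request()
    for delay in backoff_delays(retries):
        if status != GATEWAY_TIMEOUT:
            break  # success or a different error: stop retrying
        sleep(delay)
        status = send_request()
    return status
```

With the defaults, a request that keeps timing out is retried after 0.5, 1, 2 and 4 seconds before the last status is returned to the caller.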
Report: "User Interface Login Issues"
Last update: Last night around 8:00pm EST our continuous integration plugin deployed a build that succeeded but inadvertently caused problems with our CloudFront cache, which prevented anybody from logging in to the UI. We worked on a fix for around 12 hours and the issue is now resolved.